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MAXIMUM LIKELIHOOD IDENTIFICATION AND OPTIMAL INPUT DESIGN 
FOR IDENTIFYING AIRCRAFT STABILITY AND CONTROL DERIVATIVES 

By David E. Stepner and Raman K. Mehra 

Systems Control , Inc, 


I 

INTRODUCTION 

Aircraft parameter identification is the process of extracting 
numerical values for the aerodynamic stability and control derivatives, 
and other subsidiary parameters (wind gusts, sensors errors, etc.) , 
from a set of flight test data (a time history of the flight control 
inputs and the resulting aircraft response variables). The field of 
identification is one that has been pursued by diverse interests for 
many years. The practical application of this work to aircraft 
flight testing has existed for over 25 years. In spite of the wealth of 
experience which has been accumulated in this span of time, important 
requirements still exist for improving the techniques for extracting 
stability and control derivatives. 

First, there exists today a greater need for stability and control 
derivatives. There are currently two principal requirements for the 
mathematical models that these coefficients provide. These are (1) to 
provide inputs to simulators*, and (2) to provide a basis for the 
design of flight control systems. A third potential may also exist. 
Because the stability and control derivatives define a given aircraft 
more uniquely than the response mode criteria such as those in the 


*This may include digital computer simulations, fixed and moving base ground 
simulators, and in-flight simulators such as variable stability aircraft. 
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Flying Qualities Military Specification MIL-F-8785 there is reason to 
believe that these parameters will ultimately play more of a major role 
in the design, testing, and certification of aircraft. 

Second, with the continuing advances in aircraft design and perform- 
ance capabilities, the ability to extrapolate wind tunnel test results 
is diminishing and the importance of flight testing is growing. This is 
aided by the new Department of Defense policy of building prototype air- 
craft and thoroughly flight testing them before a production commitment 
is made. 

The principal elements of the aircraft identification process 
(see Fig. 1.1) are: (1) the identification algorithm, (2) the flight 

control input and (3) the instrumentation. The ultimate success of 
the identification process is totally dependent on all three of these 
elements, not just one of them alone. This study was concerned with 
the first two points, namely, the development of a general advanced 
digital identification technique based on the maximum likelihood criterion 
and the design of control inputs which will enhance the ability to 
identify specific aircraft stability and control derivatives. Digital 
parameter identification techniques have already reached a stage where 
they are being used increasingly over analog matching techniques for 
extracting stability and control derivatives from flight test data. 

Systems Control, Inc., (SCI) under this present contract developed 
the maximum likelihood identification technique, which was used 
successfully to reduce data from flight tests where gusts were present. 

In such cases both the measurement noise and process noise statistics 
were identified. 

The importance of the control input in the identif iability of 
stability and control derivatives from response data has been apparent 
for a long time. Under this contract, SCI has developed and applied 
an efficient computational technique to design the optimal inputs for 
identifying specific stability and control derivatives. 
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External Disturbances 
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Aircraft Mode^Measurement 
System, External Disturbances 


FIGURE 1.1 THE INTEGRATED AIRCRAFT IDENTIFICATION PROCESS 




This report is organized as follows: 

# Section II includes the specific task objectives of this 
contract and a summary of the principal results. 

+ Section III discusses the background material for the identi- 
fication of aircraft stability and control derivatives. 

• Section IV describes, in detail, the SCI Maximum Likelihood 
Identification Method. The derivation is carried out for 
both linear and non-linear models with and without process 
noise. The relationship of the technique to the output 
error and equation error methods is described and the re- 
lated identif iability problems are discussed. Included also 
is a detailed description of the SCI Maximum Likelihood 
Identification Program. 

o Section V presents the results on the identification of the 
stability and control derivatives for several different 
aircraft and under a variety of noise conditions. This 
includes simulated data for a nonlinear model of an X-22 
VTOL, actual flight data from an HL-10 lifting body (linear 
model) and flight data containing gust effects for an M2/F3 
lifting body (linear model) . 

• Section VI covers the requirements and the background material 
relating to the design of inputs for aircraft parameter iden- 
tification. 

• Section VII describes the details of the theoretical develop- 
ment and the Computational technique of computing optimal 

inputs for identifying aircraft stability and control derivatives. 
Several examples for which analytical solutions for the optimal 
input exist are presented to illustrate the form which the 
optimal inputs take. Included also is a detailed flow diagram 
of the SCI Optimal Input Design Program. 
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• Section VIII presents numerical results showing the character- 
istics of optimal inputs, and comparing the performance of the optimal 
input with a doublet input of equal energy and duration. The 
results of a Monte Carlo simulation of the identification pro- 
cess for the short period longitudinal derivatives of a C-8 
aircraft are presented indicating a substantial advantage in 

using the optimal input, 

• Section IX states the conclusions based on the results 
of this study. 

• Section X contains recommendations for further work. 
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II 

OBJECTIVES AND SUMMARY OF RESULTS 


This section presents (1) a statement of the study objectives, 

(2) an outline of the SCI Maximum Likelihood Identification Method and 
the SCI technique for designing optimal inputs for aircraft stability 
and control derivative identification and (3) a summary of the principal 
results of this study. The section is self-contained and is intended 
to provide the reader with an overview of the report. 

2.1 Study Objectives 

The objectives of this study were two-fold. First, it was desired 
to further develop the Maximum Likelihood Identification technique, 
originated by SCI in 1970, to the extent that it would successfully process 
actual flight test data containing random flight disturbances. This 
processing was to include identifying the correlation function of the 
random disturbance and determining a model for its representation. The 
second objective was to develop the theoretical foundation and construct 
a computer program for designing flight test inputs which would enhance 
the ability to identify specified stability and control derivatives. 

To achieve the above objectives the following tasks were defined 
and completed: 

Maximum Likelihood Identification: 

(1) Investigate the effects of different sensitivity terms in 
the identification of stability and control derivatives 

of X-22 VTOL aircraft, from simulated data, for a nonlinear 
aerodynamical model 

(2) Check out completed ML identification algorithm on HL-10 
(Lifting Body) flight data, for a linear aerodynamical 
model, for which previous results had been obtained 
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(3) Identify stability and control derivatives of the M2/F3 
(Lifting Body) from flight data containing random disturbances, 

for a linear model for which previous identification attempts 
with output error method have not succeeded. 

(4) Investigate symptoms, causes and remedies of parameter 
identif lability and uniqueness problems. 

Input Design: 

Construct operational computer program, based on results 
of theoretical study, for designing optimal Inputs 

Perform a Monte Carlo simulation of the identification process, 
comparing the optimal input with a subop timal input, for a 
model of the C-8 linear longitudinal equations of motion. 

2.2 Maximum Likelihood Identification Technique 

For the last 20 years various techniques such as fourier analysis, 
analog matching, and the time—vector method have been used to extract 
numerical values for the aerodynamic stability and control derivatives 
from records of flight test data. It has only been in recent years that 
modern digital computer techniques have been proposed for this problem. 

One of the most successful of these computer techniques is the Maximum 
Likelihood method developed by Systems Control, Inc. This technique 
holds great promise for future identification problems involving new 
aircraft configurations (VT0L,ST0L), high angle of attack transonic 
flight regime, flight test data containing gusts and for aircraft with 
stability augmentation systems. 

In the most general case, the maximum likelihood identification 
technique is a combination of three steps: (1) Kalman filtering to estimate 

the states and generate a residual or "innovation” sequence, (2) a modified 
Newton-Raphson algorithm for the parameter estimates and (3) an algorithm 
to estimate the noise statistics (mean and variances of the measurement and 


Optimal 

( 1 ) 

( 2 ) 
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process noise). In addition, the maximum likelihood technique provides a 
lower bound on the variances of the parameter estimate, and models for 
the measurement and process noise disturbances. 

Under this contract the maximum likelihood identification technique 
has been applied to a variety of flight test data both simulated and 
real. The objective has been to exercise the technique as much as 
possible and to investigate the problems that arose. As each problem 
was solved, the specialized algorithm needed for its solution (if any) 
was added to the complete maximum likelihood identification program. 

The goal was to develop a set of general computer algorithms capable of 
dealing with problems that arise in the identification of aircraft 
stability and control derivatives. 

2.2.1 X-22 VTOL Simulated Data 

The first phase of the identification study was the processing of 
data from a simulation of the X-22 VTOL Aircraft. The longitudinal aero- 
dynamic equations of motion were nonlinear and the data contained both 
measurement noise and process noise. Each of the stability and control 
derivatives was expressed as first or second order polynomial expansions 
in terms of the longitudinal velocity. The objective was to identify 
23 of these expansion coefficients and the quantitative effect of in- 
creased noise power on the quality of the parameter estimates. 

The problems encountered were almost all associated with either the 
aircraft model or the control input sequence. It was discovered quite 
soon in the investigation that a simple step input did not sufficiently 
excite the aircraft modes to allow for the accurate identification of 
many of the derivatives. The use of a multistep input improved these 
parameters a great deal , because of the model structure the input and 
output noise sequences were correlated. Accounting for this correlation 
improved the parameter estimates by increasing the estimate error co- 
variances, thereby bringing many estimates to within one standard deviation 
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of the actual values. During earlier investigations, some of the 
sensitivity terms were not included in the identification algorithm. 

When these terms were added, however, the quality of the parameter 

estimates changed very little, although the computer time, per iteration, 
more than doubled. 

A compilation of the results indicated that for "low" measurement 
and process noise, the maximum likelihood identification technique was 
able to identify all of the expansion coefficients, except those for 
accurately. When the expansions were recombined to form the time- 
varying stability and control derivatives, the fit to the derivatives 

M o’ M q’ V M <$ * V V X 6» Z 0 was ver y S° od » the fi t to Z w was acceptable 
and only the fit to would be considered unsatisfactory. When "moderate" 

process noise was used all the estimates of the expansion coefficients 
degraded. 

2.2.2 HL-10 Flight Data 

The second phase of the identification study involved using the maximum 
likelihood technique in the output error mode to identify the linearized 
lateral stability and control derivatives from flight test data for an 
HL-10 lifting body. Although the technique had no difficulty in accurately 
fitting the observed data (p, r, <#> , g and n ), several of the derivative 

a 

y 

estimates had opposite signs from the wind tunnel/ theoretical derived 
values which were used as initial estimates. These incorrect signs could 
be attributed to any one of the following factors: (1) insufficient 

excitation of particular aircraft modes due to inadequate input or action 
of the SAS system, (2) the linearized dynamics not sufficiently accurate 
for the flight conditions of the data, or (3) correlated measurement 
noise due to instrumentation system dynamics. 

In an effort to correct these signs, a modification to the maximum 
likelihood technique was attemped. This modification was to add to the 
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likelihood criterion a quadratic term putting a weighted cost on the 
difference between the a priori parameter estimates and the latest estimates* 
Using the weights supplied by NASA Edwards FRC, this "a priori weighting" 
method resulted in the correct signs and only a slightly (10Z) degraded 
fit to the observed data, as long as the measurement biases were identified, 
in addition to the other parameters. 

2.2.3 M2/F3 Flight Data 

This third phase of the study involved extracting the linearized, 
lateral stability and control derivatives of an M2/F3 lifting body from 
flight test data containing gusts. Unlike the HL— 10 data which had been 
successfully processed earlier by the use of the Output Error technique 
neither a satisfactory set of stability and control derivative estimates 
nor a satisfacory fit to the observed data had not been obtained from 
the M2/F3 data. Using an approximation that the gust noise in the sideslip 
angle measurement was much greater than the measurement noise and the maximum 
likelihood method with a Kalman filter model to account for the process 
noise, an accurate fit to the observed data was obtained. However, as in 
the HL-10 case, some of the estimated derivatives had signs opposite to 
those of the a priori estimates. 

The a priori weighting method, which was used successfully on the 

HL-10 data, proved to be not useful on the M2/F3 data. Two other techniques, 

both dealing with identif iability problems, were investigated. The first 

technique involved fixing at their a priori values one or more of a set 

of unknown parameters whose effect on the observed data was very similar 

(e.g. parameters that appear as a sum) or any parameter which has neglible 

effect on the data. The best fits to the M2/F3 data was obtained with 

the derivatives L',L,L A ,N,N and all the 6 derivatives fixed, 
p r p p r r 

The quality of the fit, however, was below that obtained without any para- 
meters fixed. 

The other technique involved eliminating from the set of allowable 
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values for the parameter estimates those eigenvector directions about 
which very little Information could be gained from the data. These 
singular directions are associated with the smaller eigenvalues of the 
Information matrix. When applied to the M2/F3 data, three singular direct— 
tions were determined. Unlike the other two techniques of fixing parameters 
or a priori weighting, the fit to the observed data remained very good and 
most of the sign problems were corrected. It is felt that this method 
offers great promise in future applications. 

2.3 Optimal Input Design 

As was shown with the X— 22 simulated data, the use of the proper 
control input sequence can greatly improve the quality of the parameter 
estimates. This is done by maximizing the sensitivity of the system 
response to the unknown parameters to be identified. The optimal input 
to be used in system identification would therefore be one which optimizes 
some criterion based on the output sensitivity with respect to all the 
parameters to be identified. 

During the second part of this contract, a computer program was 
developed which determines, for an arbitrary linear system and an arbitrary 
selection of parameters to be identified, the optimal input for parameter 
identification. The two criteria for optimality used in this program are 

(1) maximum sum of the (squares of the) the output sensitivities and 

(2) maximum product of the squares of output sensitivities. The first 
criterion is related to the trace of the Fisher Information Matrix and 
the second criterion is related to the determinant of the same matrix. 

The Information Matrix itself is the inverse of the Cramer-Rao lower 
bound for the covariances of the parameter estimates. 

The only constraint put on the input is one of total energy. State 
and input amplitude constraints can be imposed indirectly by changing the 
input energy content. In addition algorithms have been added which will 


11 



specify the optimal input for a specified data length as well as investigating 
the frequency content of the optimal input. 

The major emphasis in the optimal design part of the contract was 
in two areas. The first was to investigate the properties of the optimal 
input with respect to frequency content, comparison with a subop timal 
input and the effect of parameter uncertainties. The second involved 
a Monte Carlo simulation of the identification process involving comparisons 
of an optimal input and a suboptimal doublet input for identifying the 
short period dynamics of a C-8 aircraft. 

2.3.1 Optimal Input for C-8 Aircraft Identification 

The optimal input for identifying the five stability and control 
derivatives associated with the short period longitudinal dynamics of a 
C-8 aircraft (assuming a priori wind-tunnel parameter values) was derived 
using the trace of the Information matrix criterion. When compared to 
the suboptimal doublet input of equal energy and duration the optimization 
criterion was almost 20 time as large. Frequency domain analysis inferred 
that the input was made up of a DC component to identify the gain para- 
meters (control derivatives) and a sinusoidal component at the system 
natural frequency. This was to maximize output signal power and optimize 
the identification of the stability parameters. It was further found 
that if the optimal input was determined based on estimates of the stability 
derivatives which were 10% in error, the qualitative character of the optimal 
input did not change, and the accuracy with which the parameters could be 
identified remained approximately the same. 

The last exercise of this first part was to determine the optimal 
input based on the second performance criterion viz. maximizing the product 
of the diagonal elements of the information matrix. Based on the value 
of the expected standard deviations for the parameter estimates, the optimal 
input determined from this second criterion was much improved over that 
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determined from the first. 


2,3,2 Monte Carlo Simulation 

The more realistic test for the optimal input is to use it under 
actual identification conditions, to determine if the statistics 
of the parameter estimates and computed information matrices match those 
predicted from a priori analysis. A four state linear model of the full 
08 longitudinal dynamics was used in generating the simulated flight 
data, with the control input designed to identify only the five short 
period dynamics. In addition, the control input was designed with each of 
the short period stability and control derivative changed by 50% from the 
values used in the data generation. This was to model the situation where 
the control sequence for aircraft identification is detemined from a 
priori wind tunnel or theoretical derivative values. 50 complete identi- 
fication runs were made both with the optimal input and with a suboptimal 
doublet input of equal energy and duration. 

The parameter estimates from each run were compiled and total results 
evaluated based on 50 runs. With all measures of comparison, the optimal 
input greatly out performed the suboptimal input. Histograms of the 
parameter estimates were also compiled and compared. The results after 
50 runs closely matched the results predicted by the Cramer-Rao lower 
bound. Experiments were also run by modifying the optimal and the doublet 
inputs through the servo transfer functions. The use of optimal inputs 
in flight testing for the determination of aircraft stability and control 
derivatives appears, therefore, to be a very useful and powerful tool. 
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Ill 

BACKGROUND FOR AIRCRAFT PARAMETER IDENTIFICATION 


Although extensive time and effort, over a period of the last 20 years, 
has been expended in the development of more exact aircraft stability and 
control derivative identification techniques, up until recently, the extrac- 
tion of these derivatives from flight data remained a difficult and time- 
consuming problem. An emphasis on working directly with flight data, in 
addition to dealing with wind tunnel tests or theoretical calculations, has 
evolved as a result of what is often gross disagreement between wind tunnel 
and flight test derivatives, as well as the known difficulties of obtaining 
dynamic derivatives and extrapolating them to full scale derivatives from 
the wind tunnel values. 

There have been many methods proposed and tried for extracting stability 
and control derivatives from flight data. Most of these have proved to be 
successful only under idealized conditions such as no wind gusts or modeling 
errors and known instrumentation accuracies. Very often a good deal of the 
data collected during a flight test program has to be discarded for lack of 
a technique which is general enough to process it under less than ideal con- 
ditions. 

The emergence of the digital techniques during the past few years, 
resulting in the development of the Maximum Likelihood Identification Techniques, 
has given rise to the realization that much of the previously discarded data 
can be successfully processed. As the limitations of the instrumentation system, 
flight control input and inadequate aerodynamic model are recognized and com- 
pensated for, and the presence of wind gusts is included In the model structure 
and accounted for in the identification algorithm, the best set of identified 
values for the stability and control derivatives can be obtained. 
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This section will discuss several of the previous identification 
techniques which have been used to process flight data. This discussion 
will bring out the similarities that exist among these methods, and mention 
the aircraft flight data to which the methods have been applied. The next 
section, then, will provide a detailed explanation of the Maximum Likelihood 
Technique to be applied later on to flight test data containing gusts. 

3.1 Previous Identification Methods 


Although a large number of identification methods have been used in the 
past, only some of the more common methods which are currently in use will 
be described. 

3.1.1 Time Vector Method 


The time vector methods for derivative identification is derived from the 
time-invariance of the amplitude and phase relations between the state 
variables (degrees of freedom) of an exponentially damped second order system 
and the derivatives and integrals of the state variables. This invariance 
is used to determine the values of the amplitude-phase relations, thereby 
determining the aircraft stability and control derivatives (Ref. 1). 

When more than one state variable (degree of freedom) is involved in the 
transient response, and there is a common natural frequency, the instantaneous 
value of any one state may be readily determined if the characteristics of any 
one of the motions are known, along with the amplitude ratio and phase angle 
relative to the characteristic motion. The time invariance of the amplitude 
ratios and their phase angles permits the representation of any one of the 
linearized equations of motion as vectors. The properties of these vectors, 
plus the requirement that the vector sum of the quantities in any one equation 
equal zero, makes possible the determination of two unknown derivatives in any 
one equation. 
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As Ref. 2 points out, the time vector method has the principal 
disadvantage that it can only be applied to control-fixed, transient- 
oscillation aircraft responses with damping ratios less than-. 3. Further- 
more, the successful application of the time-vector method is highly de- 
pendent on the operators’ skill. 

3.1.2 Analog-Matching Methods 

The analog matching technique is actually an output error method since 
it strives to iteratively minimize the errors of the various responses through 
operator manipulation of the values of the stability and control derivatives. 

It is often used as a backup method for validating the more modern digital 
techniques. However, there are several disadvantages to the analog match- 
ing technique. First, the method works most successfully only when a 
single control surface is moved at a time and then only when the maneuvers 
are simple. (Ref. 1). Second, when the maneuvers are made with a stability 
augmentation system or other form of dependent control input, the data is 
difficult to analyze. Finally, this method is extremely time consuming, 
even in face of the fact that recent procedures, through the use of hybrid 
computers, has reduced the time considerably. For example, the time in- 
volved in analyzing a lateral-directional flight maneuver, from receipt of 
flight data to final results, is approximately four hours (Ref. 2); the analog 
matching technique is also extremely susceptible to uniqueness problems 
since the success of a data analysis is very dependent upon the type of 
control maneuvers used. In such a case the skill and knowledge of the operator 
would play an important part in determining the ultimate success of the analysis 

The analog matching technique has been used by the Air Force Flight Test 
Center (Ref. 3), the Naval Air Test Center (Ref. 4), and the NASA Flight Test 
Center (Ref. 5) for the F-104. X-15, B-70, HL-10, M2/F3, X-24 and PA-30 aircraft 
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Host of the remaining identification techniques, almost all of which 
require the use of a digital computer, can be classified as either 

1. Equation error methods, 

2 • Output error methods, or 

3. Advanced methods. 

These methods differ by (1) the performance criterion that they are 
developed from, (2) the kinds of estimates they produce, and (3) the problems 
to which they can be applied. 

3.1.3 Equation Error Methods 

Equation error methods (Ref. 6) assume a performance criterion 
that minimizes the square of the equation error (process noise) . All of 
these methods are basically least squares techniques and, in general, it is 
necessary to measure all the response variables and their derivatives. The 
procedure is to express the stability and control derivatives as functions 
of the measured responses using the equations of motion. This results in 
n or more linear equations in n unknowns. For those cases where the time 
derivatives are not measured, various f, method functions" are used to operate 
on these equations (take time derivatives, Laplace or Fourier transforms, 
etc.) to obtain equations that are linear in the unknown stability and control 
derivatives. Since these methods do not allow for measurement errors (instru- 
mentation errors), they result in biased estimates when this type of error 
does exist. The principal use of these methods are as start-up techniques 
for the output error and advanced methods. 

The equation error methods have been used or are being used by Cornell 
Aeronautical Laboratory (Ref. 6), Air Force Flight Test Center (Ref. 3), 
and Delft University of Technology (Ref. 7). 
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3.1.4 Output Error Methods 


Output Error Methods (Refs. 8 through 17) minimize the square 
of the error between the actual system output and the output of 
a model used to represent the actual system. This method assumes measure- 
ment noise but no process noise. Typical output error methods include 
Newton-Raphson, Gradient methods, the Kalman Filter (without process noise), 
and modified Newton-Raphson, differential correction, and quasilinearization 
(all three of which are the same method) . 

The modified Newton-Raphson method has been used extensively in flight 
test applications for the past several years. It is the one method that has 
been used on an operational basis and for which the most experience exists. 
This method has been or is being used by (among others) : (1) the NASA Flight 
Test Center (Ref. 5) on the X-24, X-14, XB-70, 990, HL-10, M2/F3, Jet 
Star and PA-30 aircraft; (2) the NASA Ames Research Center (Reference 18) 
on the Learjet, XV-5, 990 and the C-8 aircraft; and (3) the NASA Langley 
Research Center (Refs. 19,20) on the XC-142, Navion and F-4 aircraft. 

(NASA Langley program has automatic update of the weighting matrix based on 
the maximum likelihood criterion.) 

The principal disadvantages of the output error methods is that, because 
they do not include process noise in their performance criterion, the results 
degrade when process noise (gusts, modelling errors) exists. This may result 
in the computer program not converging or in estimates that have large var- 
iances or poor estimates (Ref. 21). However, as long as these methods 
are applied to linear flight regions, or where the form of the equations is 
known, or where gusts do not exist, they work very well. 

3.1.5 Advanced Methods 


The most general identification problem is one of extracting stability 
and control derivatives, for non-linear aircraft models, from flight data 
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containing both measurement and process noise. The one advanced technique 
that has demonstrated the capability of extracting stability and control 
derivatives from flight data under these circumstances is an Implementation 
of the maximum likelihood criterion . This numerical algorithm, developed 
by SCI, is a combination of three steps: (1) Kalman filtering to estimate 

the states and generate a residual sequence, (2) a modified Newton-Raphson 
algorithm for the parameter estimates, and (3) an algorithm to estimate the 
noise statistics (means and variances of the measurement and process noise) • 

The details of the numerical method are outlined in the next section. 

The success of the SCI maximum likelihood technique can be attributed 
to several important attributes of this method: 

1. It does not require a priori knowledge of the process noise 
covariance, measurement noise covariance or the initial parameter 
estimate covariance. These covariances are determined as part of 
the Identification procedure. 

2. When process noise does not exist, this method simplifies to the 
modified Newton-Raphson output error method (although with a spe- 
cific weighting matrix) . 

3. When no measurement noise exists (an unlikely event) this method 
simplifies to the least squares equation error method. 

4. The Cramer-Rao lower bound on the covariance of the error in the 
stability and control derivative estimates are obtained as part 
of the algorithm. 

5. The minimum mean-square aircraft state variables (response variables) 
are obtained as an integral part of the algorithm. It is not re- 
quired, however, that initial state estimates be supplied. 

The following section gives a detailed derivation of the Maximum Likelihood 
Identification Method. 
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IV 


MAXIMUM LIKELIHOOD (ML) IDENTIFICATION 


The notion of the maximum likelihood estimate which was introduced 
into statistics by R. A. Fisher In 1906 is based on a relatively simple 
idea. Assume that the outcome Z of an experiment depends on an unknown 
parameter 0. We want to infer the best value of 0 from the observation 
Z. One answer is to choose that value of 0 which makes the observed 
value Z the most probable one to have occurred. This can be rigorously 
stated as: choose 0 to maximize the conditional probability of Z, given 

a value of 0 ; i.e. 

0 83 max p(z[ 0) 

0 

where 0 is the maximum likelihood estimate of 0 and p(z|0) is the 
conditional probability of Z, given 0. The same estimate is obtained by 
maximizing log p (Z | 6) which is known as the likelihood function. 

The above basic idea can be carried over to linear and nonlinear 
dynamic systems, with process and measurement noise, but the details pf the 
application become quite involved. In practice, there are two major pro- 
blems in obtaining ML estimates for dynamic systems. These are: 

1. Deriving an expression for the likelihood function, and 

2. Maximizing the likelihood function with respect to the unknown 
parameters. 

These two problems are elaborated upon further v The likelihood function 
is the logarithm of the joint probability density of the observations given 
the parameters. If the observations are independent, the joint probability 
density function is easily written down since it is just the product of the 
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probability densities of each observation given the parameters. The deriva- 
tion of the likelihood function becomes much more difficult when the obser- 
vations are correlated. This is necessarily the case for dynamic systems 
with random inputs since the state at any time is correlated with the state 
at all the previous times. In the next section, it is shown how the likeli- 
hood function for a dynamic system can be derived in a simple form using a 
Kalman filter and the resulting white noise innovation sequence. This is 
shown schematically in Figure 4.1. 


The second problem in obtaining ML estimates is a computational one. 
Generally, the likelihood function is highly nonlinear in terms of the para- 
meters. For finite data lengths, it is also known to have several local 
maxima. In the case of dynamic systems, certain differential equation 
constraints have to be satisfied. The choice of a suitable search algorithm 
is very important for the successful application of ML identification. 

The maximum likelihood identification method, as implemented by SCI, 
is an adaption and extension of the recent work of Astrom (Ref. 22), 

Kashyap (Ref. 23) and Mehra (Ref. 24). It is capable of solving the most general 


Measurement Noise 



FIGURE 4.1 IMPLEMENTATION OF MAXIMUM LIKELIHOOD ESTIMATOR 
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Identification problem, including (1) systems governed by non-linear 
differential equations of motion, (2) the presence of additive random 
process noise in the equations of motion, and (3) random disturbances 
corrupting the measurements of the system inputs and outputs. In this 
section the development of the theory justifying the use of the likelihood 
function as an optimizing criterion for identification is developed and 
the numerical algorithm for implementing the maximum likelihood identification 
method is outlined. The development is first given for linear systems and 
then extended to nonlinear systems. 

4.1 Linear Systems: 

Consider the linearized aircraft equations of motion 
x - Fx + Gu + rw 
where 


x(t) = n x 1 state vector (p, q, r, u, v, w, etc.) 

u(t) - p x 1 input vector (6 e , 6 a , 6 r ) 

w(t) = q x 1 vector of random forcing functions 

Let the measurement equations be 

vhere y(t) « Hx(t) + Du(t) + v(t) 

y(t) = r x 1 output or measurement vector 
v ( t) =* r x 1 vector of random measurement errors 

and 

E { w(t) } = 0, E { w(t)w T (-r) } - Q 6(t - t) 

where 6(t - x) is the Dirac Delta function. 

E { w(t)v T (x) } = 0 

E { v(t) } “ 0, E { v(t)v T (x) } = R6 ttX * 


22 



It is assumed that the structure of the model is known. The 
vector of unknown parameters in F, G, T, H, Q, R and x(o) are denoted by 
0. Thus 0 includes all the unknown stability and control derivatives, 
noise variances and the initial states. 

If a sequence of observations y(l) ,y(N) is made of the 

system state with noise, the maximum likelihood estimate of 
0, following the idea stated earlier, is given by 

0 = max p(Y N |0) (4.3) 

0 

where = {y(l), , y(N)}. With successive applications of Bayes 

Rule, an expression for p(Y N |0) can be derived as 


p(Y N /0) ® p(y(l),... y(N)/0) 

- p(y(N)|Y N _ r 0)p(Y N _ 1 |0) 

“ P(y(N)lY N _ r 0)p(y(N-l){Y N _ 2 , 0 )p(Y n _ 2 |0) 


N 

" n p(y(j)lY , o) 
j-l 2 ~ l 


Since the logarithm is a monotonic function, the maximum likelihood estimate 
can also be written as 

=* max 
0 

where log p(Y^|0) is the likelihood function. 


0 “ max [log p(Y N |©)] 


-11 - 

2 ^ log p(y(j) lYj^.0) 

- 4=1 


(4.4) 
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If x(o), w(t) and v(t) are normally distributed, p(y (j ) !Y j_^,0) 

will also be normal and can be uniquely determined by computing the mean 
and covariance. Therefore define 

0} - y(j|j-i) 

and 

cov{y(j)|y , 0} = E{(y(j) -y(j|j-D) (y(j)-y(j | j-l)) T > (4.6) 

“ B(j J j-1) 

With these assumptions, the term log p (y(j)|Y 1 0) can be written as 

J“ J *» 

log pfoOlY^ 0) =* Const.- -|-(y(j)-y(j|j-l)) T B(j I j-1) (y(j)-y(j [j-1)) 

- log |B(j | j-1) | (4*7) 

The problem of determining the maximum likelihood estimate has now 
become one of finding a way of calculating the conditional mean, y(j|j~l), 
and the error covariance , B(j|j-1). These quantities, however, are precisely 
the output of a Kalman Filter (Ref, 25) state estimator, given 0. This filter 
is designed to recursively process measurements one at a time, and, at each 
point produce the minimum variance state estimate based on all the data 
received up to that point. 

The Kalman filter prediction and update equations can be derived as 
* 

follows : 

*For a more rigorous derivation of the Kalman Filter, see either Kalman (Ref. 26) 
or Kailath (Ref. 27). Also note that the conditioning on 0 has been omitted 

from the equations to simply notation. 
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Initial Conditions ; The Kalman filter is started with a priori 
state estimate it (0 1 0) and covariance P(0j0). 

The state prediction is done using the equations of motion and 
state update is done using the measurements. 

Prediction Equations : At time (j-1), the Kalman filter has a state 

estimate x(j-l|j-l) and covariance P(j-l|j-l). It is required to 
predict the state at time j , The resulting state estimate is denoted 
b y and has a covariance of P(j|j-1). The relationship be- 

tween the updated and the predicted estimates can be obtained by taking 
conditional expectations on both sides of Eq. (4.1) and interchanging 
the operations of expectation and differentiation. This gives 


(j[ A . ~ 

x(t|j-l) - Fx (t | j-1) + Gu(t) 

(j-1) £ t _< j 


where the predicted value of the white noise w(t) based on previous infor- 
mat ion is equal to zero. 

The covariance equation for the predicted estimate x(t/j— 1) 
can be obtained by subtracting Eq. (4.8) from Eq. (4.1) and using 
the covariance propagation equation derived in Bryson and Ho (Ref. 25). 

df P(t|j-1) = F P(t|j-1) + P(t|j-1)F T + TQr T (4.9) 

U pdate Equations: The update equations for the Kalman filter can be 

derived using a well-known property of the conditional normal distribu- 
tions (Ref. 28), viz.. 


E { a|b ) - a + P^P-J (b - b> 


(4.10) 


cov(a| b) - P - p p } p T 
aa ab bb ab 


(4.11) 
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where a and b are normal random variables with 


E{a> ■= a , = b 

cov(a) - P aa , cov(b) - P feb 

E{ (a-7) <b-b) T } - P ab 


Replacing a by x(j) with mean a - x(j|j-l) and covariance 

p ■ P(j|j~l) and replacing b by y(j) with mean 
aa 

b * y(j|j-l) " Hx(j|j-1) and covariances 


P bb ’ 

HP(j | j“l) H T + R 

(4.12) 

P ab = 

P(j|j-1) h t 

(4.13) 

we obtain. 

x(j 1 j) 

- X 0 1 j-1) + K(j) (y(j) - Hx(j|j-1)) 

(4.14) 

K(j) ■ 

- P(j|j-1)H T (HP(j|j-l)H T + R) -1 

(4.15) 

and 

p(j|j) 

= (I - K(j)H) P(j|j-1) 

(4.16) 


The quantity (y(j) - y(J | j-1) ) represents the new information 
brought forth by the measurement y(j). It is known as the "innovation" 
sequence and has been shown to be zero mean, Gaussian and white (Ref. 29) 
Denoting the innovations by v(j), the likelihood function can be 
written as 

N 

log p(Y |0) - - \ Y { v T (j)B _ 1 (j|j-D V(j) + l 0 g|B(j|j-l)|> (A. 17) 

j-1 
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where 


B(j|j-1) - HP(j|j-l)H T + R 


(A. 18) 


The maximum likelihood estimate 0 is obtained by maximizing 

(4.17) with respect to 0, subject to the constraints In equation 

(4.8)-(4.9), (4.14)-(4.16) . This is a very difficult optimization 
problem. An approximation suggested in Ref. 24 simplifies the problem 
tremendously. It is assumed that the filter gain K(j) and covariance 

have reached constant values K and B. The vector 0 of un- 
known parameters Is now defined to include (in addition to F,G) K and 
B instead of Q and R. Reference 57 gives a detailed derivation of the 
relation between K, B and Q, R. Then 

N 

log p(Y n |0) - - | J{v T a) B " 1 v(j) + log|B|] (4.19) 

j"l 

Maximizing (4.19) over B, produces 

N 

B “ s 2 '•a I S)v T cj | (4-20) 

3-1 


where a is the ML estimate of unknowns of F, G and K. It is given by 
the root of the equation 


J v T (j)B" 1 


Mi) 

3 a 


0 . 


(4.21) 


, 3v(1) 

where - ^ i s calculated from Eq. (4.8) - (4.18). The root of (4.21) 
is found by a Newton-Raphson iteration. Once a is obtained, R and Q 

can be obtained from equations (4.9) and (4.18). In this way the non-linear 
constraints imposed by the equations (4.9), (4.15), (4.16) and (4.18) are 
avoided during optimization. 


4.2 Nonlinear Systems 

The approach to obtaining the maximum likelihood parameter estimates 
for nonlinear models is conceptually similar to that for linear models. 
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Consider a nonlinear dynamic system model of the form 


(A. 22) 
(A. 23) 


i(t) - f(x(t), 0, u(t)) + Tw(t) 

y(t) » h(x(t)) + v(t) 
where f(‘) and h(‘) are n x 1 and r x 1 vectors of nonlinear functions. 
Also, w(t) and v(t) are Gaussian white noise sequences with zero mean 
and covariances Q and R. 

The evaluation of the exact maximum likelihood estimate involves the 
calculation of the conditional probability p(yO)|Y j _ 1 ^) as in the linear 
model case. This would require an optimal nonlinear filter, which, to date, 
is computationally unfeasible since a complete description of p<y(J) 1*^,6) 
requires computing all its moments. As a result, it is proposed to use an 
Extended Kalman Filter (Ref. 30) of the following form: 

r = f(x(t|j-i), e, u(t)) (4 ‘ 2 

*(j|j) = x(j|j-l) + K(j)vCj) < 4 * 2 

v(j) - y(j) - h(x(j [j-D) (4 * 2 

The Kalman gain K(t) is calculated from equations (A.13)-(A.l6) by 
using the time varying matrices H and F, defined by 

H(t) = |j| 

3 x = x(j | j-1) 

«« - Hi - . 

x “ X ( j I j ) 


(A. 27) 
(A. 28) 


Notice that the Extended Kalman filter linearizes the equations around 
the latest best estimate of the state. More advanced filters such as Second 
Order Filters, Single Stage Smoothing Filters, etc. (Ref. 31) can be used for 
state estimation, but for the aircraft parameter identification problem, when 
all the states and accelerations are being measured accurately, an Extended 
Kalman filter comes quite close to the optimal nonlinear filter in accuracy. 
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Kailath (Ref. 27) has shown that the density of the Innovation V(t) tends to 
a Gaussian density as the sampling rate is increased. Thus, for high sampling 
rates the likelihood function can again be written as 

N 

j = log(Y N |0) - ]> v T (j) B _1 ( j ) v ( j ) + log | B (j ) | (A. 29) 

j=l 


The validity of the above two assumptions viz: high sampling rates and accurate 
measurements should be checked in practice for each application of this 
method. 

Remark : 

The use of an Extended Kalman filter here is for state estimation only. 

It is also possible to use an Extended Kalman Filter for simultaneous state 
and parameter estimation (Refs. 21* 32). In the authors 1 opinion* this is not 
desirable since the uncertainties in the states are much smaller than the 
uncertainties in the parameters. Therefore, the assumptions of lineariza- 
tion which are valid for state estimation are generally not valid for para- 
meter estimation in the aircraft parameter identification problem. Moreover, 
the Extended Kalman Filter for simultaneous estimation of the state and the 
parameters assumes knowledge of the a priori covariances which are unknown 
for the parameters. This is one of the reasons why an Extended Kalman 
filter typically gives unreliable confidence limits on the parameter 
estimates (Ref. 21), The maximum likelihood method described here will be 
shown to provide realistic estimates of confidence limits on a test case. 

It has also been found to converge in several cases where the Extended Kalman 
Filter for simultaneous state and parameter estimation failed to converge 
properly due to poor a priori values for the parameters. 
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4.3 Numerical Optimization Algorithm 


The optimization algorithm described here for obtaining the maximum of 
the likelihood function is the Modified Newton-Raphson or Quasilinearization 
Method t Determining an update to a set of parameter estimes 6 , which will 
decrease the value of the likelihood function (cost) , using this method in 
volves computing two matrices : the gradient of the cost with respect to the 

unknown parameters, , and the information matrix, M. 


For the case of a nonlinear system with process noise, the likelihood 
function, J, is computed using an extended Kalman filter. The gradient and 
information matrix computation must therefore include at least the first order 
partials of the Kalman gain with respect to the parameters. With the nonlinear 
system model given by Equations (4 .22)-(4 . 23) , and the extended Kalman filter 
by Equations (4 .24)-(4 . 28) , the gradient of J with respect to 6 is given by 


where 


and 


39 


3v(.1) 

39 

mu 

39, 


j=i 




„, tl aja I j-D Sh 


80 

,T 


P(j |j-l) H T (j) + H(j) L 

30 k k 

,T/ 


+ H(j> taij-i) 4^ 


3R 

36 ,. 


(4.31) 


* Based on prior experience (Ref. 33), the convergence of other optimization 
algorithms, including conjugate gradient and Davidon method, has been 
found to be slower than that of the quasilinearization method. 
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3P(1 1—1) 

The recursive equations for — — are obtained directly from the 

Riccati equations, with the separation into update at the measurement times 
and prediction between measurement times again being made. 


Prediction : 

From Equation (4,9), the prediction equation for (j-1) <_ t j 
becomes 


3 L fe |j r i). . 3F ( | . ,v F 3P(th-i) + ap(tli-i). F T 

30 , 30 , ^ 13 * 30 . 30 , 

k k k k 


Qr T + rfp- + r Q ir 


'k 


30 


(4.32) 


k 


Update ; 


.th 


The update equation, at the j measurement point, is obtained 
from Equation (4.16) 

3P a ^ J) - (I-K(j)H) ^ - (i L )- . 1 . ) . - iMil HP(jlj-l) 


30, 


30. 


-K (j) |f P(j|j~l) 


30 


(4.33) 


where 


3K(H = , 3P( j l j-1) , h T [Hp(j | j.!) h T + R] -1 


30, 30, 

k k 


+ PO I j-D H [HP(j|j-l) H T + R]' 


- K(j)["|| p(j|j-i) h t + h 9Pliibil H T 

L k k 

j 




(HP(j|j-l)H T +R) -1 


(4.34) 
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defined as the sensi- 


3x(1 |1-1) 

The recursive equations for the term — S 

tivity equations and appearing in Equation (4.31), are obtained from the 
update and prediction filter equations (4. 24)- (4. 25) ♦ 


Prediction: 


3x 

30, 


3f 

36, 


F <*> t 


j-i < t < j 


(4.35) 


Update : 


3*(ll.1) 

30 k 


iMiltii + 11 m V(J) . K«) [h 0) gj 


(4.36) 


These same sensitivity equations are used to compute the information 
matrix, which is given by 

n 

B-^j) /(j) B-O) 51 

'k 


ae k 3e £ 


= E ti- T(j) B_1(j) - jI * T <J> B- X a) ff< J) B- l a)fe 0) 

* * o k £ 


j-1 


- v T (j)B' 1 (j) |f«> B _1 (j) & 0> 


30, 




(4.37) 


Note that second order partial terms and several first order partials (of 
matrix inverses) have been neglected. All the partial derivative terms appear- 
ing in Equation 4.37 can be obtained via Equation 4.31. 
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The update to the parameter estimates 0, or step size A0, is then com- 
puted using the following equation 


-«- 1 ( i ?) 1 


Since M is the Fisher Information Matrix, M _1 provides the Cram€r-Rao 
Lower Bound on the covariance of the 0 estimates. The ML method approaches 
this lower bound asymptotically. 

4.4 Relationship to Output Error and Equation Error Methods 

As stated earlier , one of the principal advantages of the maximum 
likelihood method is that, under special circumstances, it reduces to the 
output error or equation error method, both of which have been widely used 
for extracting stability and control derivatives from flight test data. 

For the case where there is no process noise present, i.e., w(t) = 0, 
the process noise covariance, Q(t), is identically zero. With P(0) either 
equal to zero, (if the initial state estimates are known) or small, this 
implies P(t|t-1)=:0 for all t after some initial transient (see Equations 
(4.9 )-(4.16)). The Kalman gain will then also be identically zero (Equa- 
tion (4.15) and the innovations sequence reduces to 

v(t) = y (t) - Hx(t) 
for the linear case, and 


v(t) = y(t) - h(x(t) ) 


in the nonlinear case. In both cases v(t) is exactly the output error. 

The only difference then between the maximum likelihood method and the more 
classical output error method is the choice of the weighting matrix. In 
the maximum likelihood method, it is given as R, the measurement noise 
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covariance! or its estimate. In the maximum likelihood method, is is 
chosen as 

N 

* I \ ^ * T 

R - ^ 2 ^ v(j,0) v (j ' 0) * 

j-1 

For the case where no measurement noise exists, the measurement noise 
covariance R(t) is identically zero. For the case in which all the states 
and their time derivatives are measured, the likelihood function is the sum 
of squares of the equation error at sampling times. Thus, the ML estimates 
are identical to the equation error estimates. 

4.5 Identifiability and Uniqueness Problems in Extraction of Stability and 

Control Derivatives 

Although the maximum likelihood method discussed in the previous section 
represents one of the most advanced identification techniques developed to 
date, there still remain some basic problems associated with extracting 
stability and control derivatives from flight test data. Most of these pro- 
blems can be classified under the heading of "identifiability," which is re- 
lated to the degree of excitation for the particular modes of the system under 
investigation and the ability to identify the associated parameters. Identi- 
fiability also relates to whether the parameters themselves can be identified 
or whether they can only be identified as part of a linear combination. This 
section will discuss some of the symptoms and causes of identifiability pro- 
blems, and a few of the methods which have been used to solve them. 

4.5.1 Symptoms and Causes of Identifiability Problems 

The most obvious symptoms of identifiability problems are physically 
nonrealizable parameter estimates and large associated error covariances. 
Either of these symptoms may arise for a number of different reasons. If 
the input sequence does not adequately excite some of the modes, or if the 
Stability Augmentation System is operating, thereby suppressing some of the 
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aircraft modes, the associated parameters may not be identifiable. If the 
model chosen to get the input-output data is inadequate, the parameters of 
that model may be forced to account for some major unmodeled effects. The 
estimated parameter values may, therefore, be quite different from what aero- 
dynamic theory and previous results may indicate. If there are large, un- 
accounted for instrumentation errors or errors in the location of the c.g. 
and the sensors, again non-physical parameter values may result. Finally, 
such additional factors as too short a data length, local minima in the cost 
functional and poor initial parameter estimates may also result in non- 
physical parameter values. 

Large error covariances principally result from poor input sequences 
and attempts at identifying too many parameters. The first factor reduces 

the sensitivity of the ouput to variations in some parameter values, and 
the second factor causes linear dependencies between parameter estimates. 
Since an extraneous parameter in the model does not, by definition, improve 
or degrade the fit to the observed data, its estimated value will be of 
no significance and the error covariance of the estimated value will be 
large. 


Probably the most common ident if lability problem encountered in 

processing flight data results from parameter dependencies. This may 

occur through a pair of parameters which always appear in the equations 

of motion together, as with and C , or through a poor choice of 

q a 

inputs such that some of the aircraft response variables are linearly 
correlated, or it may occur through an overspecification of the number 
of parameters to be identified. In each case, the result of the 
dependencies is a nearly singular information matrix, which when inverted 
to obtain the step size in the parameter estimates, causes numerical 
problems. 

Additional numerical problems associated with a nearly singular 
information matrix arise when the control input is expressible as a linear 
combination of the aircraft response variables. This matrix singularity 
results from the linear dependence between the partial of a response 
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variable with respect to a parameter in the input matrix and the 
partials with respect to a parameter of the dynamics matrix. Since 
the same singularity exists at each data point, the resulting information 
matrix will also be singular. A second input related problem arises when 
the input is of such a nature that the time histories of several of the 
aircraft response variables appear highly correlated. All the elements 
of the partial derivative of the output vector with respect to any one of 
the parameters will be the same, introducing a singularity. 

4.5.2 Approaches to Identif iability Problems 

Four different approaches have been used to alleviate identif iability 
problems. These are: 

1 . Fixing Parameters - The usual remedy for parameter dependencies 
has often been to fix some of the dependent parameters during iden- 
tification. While this generally improves the numerical convergence 
the choice of a particular parameter to fix and the value at which 
it is fixed are generally not clear. Although it is possible to fix 
the parameters at the wind tunnel or theoretical (DATCOM) values, the 
estimated parameter values may depend upon these fixed values. In 
those instances where the wind tunnel values are inaccurate or DATCOM 
doesn't apply and no other a priori information is available, a better 
way of dealing with the parameter dependency is needed. 

2. A priori Weighting - Whenever a priori values exist for certain parameters 
in a given model structure, they can be included in the maximum likeli- 
hood method by using a Bayesian formulation. If the a priori values have 
a Gaussian distribution, a quadratic term involving the weighted dif- 
ference between the estimated parameter values and the a priori para- 
meter values is added to the likelihood function. Depending on the 
weights given to these differences, it is possible to force any of the 
parameter values to the a priori ones. The a priori values for the 
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aircraft stability and control derivatives are usually derived 
from the wind tunnel estimates and theoretical calculations. 

The weights, which are the most subjective part of this technique, 
signify the confidence in the a priori values. An alternate pro- 
cedure is to successively reduce the weights at each iteration, 
thereby discounting the dependence on a priori information. The 
main advantage of this procedure is numerical since the information 
matrix with a priori weighting is generally better conditioned than 
the one without it. This procedure is a special case of Tychonov 
Regularization used for solving an ill-conditioned set of equations (Ref. 34). 

3. Constrained Optimization - If, from practical or theoretical con- 
siderations, a range of allowable values or relationships between 
the stability and control derivatives can be specified, they can be 
used as constraints on the parameter estimates to avoid non-physical 
estimates. Such a procedure would require a constrained optimization 
technique in lieu of the Newton-Raphson optimization method normally 
used (for the output error criterion) . Including such parameter 
value constraints will most probably also reduce the convergence rate. 

4. Rank -Deficient Solutions - Without any of the above remedies, the 
parameter identify ability problems will usually appear as a difficulty 
with inverting the information matrix and obtaining accurate parameter 
estimates and error covariances. This numerical problem can be related 
to the spread in the eigenvalues of the information matrix. A perfect 
dependency among the parameters should, strictly speaking, result in a 
zero eigenvalue. However, since round-off and other numerical errors 
prevent the matrix from being exactly singular, all the eigenvalues 
will be non-zero with a spread between the smallest and largest eigen- 
value being many orders of magnitude. In such a case, it might be 
better to use a rank deficient solution for the inverse rather than 
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a full rank solution (Ref 35). That is, the inverse to the information 
matrix should be computed leaving out one or more of the smallest 
eigenvalues. Each eigenvalue which is left out relates to a singular 
direction in parameter space and, therefore, indicates a combination 
of parameters which cannot be identified uniquely, (see section 5.3.6) 

The maximum likelihood identification program described below has options 
to use the above methods for solving identif iability problems. Further research 
in this important area is badly needed if identification programs are to be 
used on a routine basis for extracting stability and control derivatives. It 
should be mentioned that two other topics related to identif iability (one of 
which is discussed elsewhere in this report) are those of input design and 
model structure determination. 

4.6 Maximum Likelihood Identification Program 

This section describes the computer program that was designed to 
implement the maximum likelihood identification method for extracting 
stability and control derivatives from flight test data. Three options 
are provided for dealing with the identif iability problem i (1) a priori 
weighting, (2) fixing parameters at a priori values, and (3) rank-deficient 
solution for the information matrix inverse. At the outset of an identi- 
fication run one of these three options is indicated (including the weight- 
ing matrix if a priori weighting is specified) and the program thereafter 
runs automatically. Step size cutting (in the event of a cost increase) 
and parameter bounding routines are always included in the algorithm, al- 
though they can be easily rendered inactive, if it is so desired. 


The flowchart for the maximum likelihood program is shown in Fig. 4.2. 
The principal steps of the algorithm are all blocked out, omitting the 
numerical procedures used to compute such quantities as the solution to 
differential equations, matrix inverses, etc. The following paragraphs 
briefly outline the functions carried in each of the numbered blocks and 
how the logic progresses from block to block. 


38 



( 1 ) 


(3) 



FIGURE 4.2 MAXIMUM LIKELIHOOD PROGRAM FLOWCHART 
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Compute 

(1) gradient vector DJ (equation 4,30), 

(2) information matrix M (equation 4,37) 


ARE ANY PARAMETERS FIXED ? 


Eliminate those rows of the gradient, and those 
rows and columns of the information matrix 
associated with the fixed parameter . 


Compute eigenvalues X^ and eigenvectors 
of M is largest, X N is smallest) 


IS RANK DEFICIENT INVERSE FOR M DESIRED? 


SPECIFY K *= minimum no. of eigenvalues to be retained 


-FOR, I « K, NP 


‘ -X ^ -1 1 

(1) Rank deficient information matrix inverse M = 2. v i v i 

i*l 

(2) Parameter step size, AP « -M * DJ 


(3) Cost associated with new parameter values 
N 

J “ i S « t u>b _ 1 u>v<j> - lo e I B <J> 


+ (P* - p o ) T w( P J - p o ) 


(W = 0 if a priori weighting NOT used) 


FIGURE It. 2 (CONTINUED) 








From amongst (P^My, (P^, AP^), ... ( P *. AP^ ) 
. retain pair with lo west cost: definp ag p*. AP 


For new parameters P , are any parameters constaints 
violated 


(1) if any component of P (previous parameter estimate set) 
was on a constraint and AP took it into infeasible 
region, set that component of A.P 5 0 

(2) compute new P* « p + ^p 

(3) For each component of P* beyond constraint boundaries 
compute ratio 


AP^ - distance beyond boundary 


if none are beyond boundary, go to step (12) 

(A) find smallest DST = DST 

— 1 1 

replace parameter estimates P with new sec of estimates 

P + AP 'DST 


return to step 3 to begin new iteration 


FIGURE 4.2 (CONTINUED) 
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The initialization procedure for the maximum likelihood identification 
method, indicated in blocks (1) and (2), consists of specifying a set of 
a priori parameter values, including the measurement and process noise 
covariances and, if desired, a set of upper and lower bounds for each 
parameter. The observations and input control time histories are then 
read in and stored. Since the maximum likelihood method is a batch pro- 
cessor, it will use the entire data record for each iteration. The ini- 
tialization concludes by specifying which of the several options are to 
be used: (1) fixing parameters, (2) a priori weighting, (3) or rank 

deficient solution for the information matrix inverse. 

With block (3) the first iteration begins. Using the equations given 
in Section A. 3 for the nonlinear system equations, the extended Kalman 
filter, the sensitivity functions and all the required partial derivatives, 
the time history for each of these quantities at each data point is com- 
puted. These differential equations can be solved using any one of a 
number of numerical techniques, e.g., Runge-Kutla. However, the majority 
of the computer time for each iteration will be consumed in solving these 
equations. When there is no process noise, Denery (Ref. 36) has shown that by 
using a transformation some of the sensitivity equations need not be evalu- 
ated, but rather can be expressed as a function of the others. The number 
of differential equations which need to be solved is thereby reduced. 

In block (A) the time histories of the quantities computed in block 
(2) are combined to form the gradient, DJ, and the information matrix, M. 

Up to this point, all the computations can be performed considering a 
very complete and general set of parameters to be identified. This set 
may, however, be more general than needed for a particular application. 

For example, it may not be desirable to identify the rudder derivatives 
if there is no rudder input. If this is the case, the components of 
the information matrix and gradient due to the parameters which are not 
to be identified (thereby being considered fixed) must be removed. Follow- 
ing through the computation of DJ and M, this can easily be done by 
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simply deleting the rows of DJ and the rows and columns of M associated 
with the fixed parameters. This is performed in block (5). 

Many of the computational problems associated with the Newton-Raphson 
optimization technique are involved with the large spread in the eigen- 
values of M. Perhaps the most exact way of computing M -1 is therefore 
to use the eignvalue - eigenvector decomposition. This decomposition is 
performed in block (6). The eigenvalues and eigenvector are also 
required if a rank deficient inverse for M is desired. 

The second program option consists of specifying if a rank dificient 
inverse is to be used, and if so, specifying, in addition, the minimum 
number of eigenvalues which are to be retained in computing M -1 . Note 

that if a rank deficient inverse is not desired, this minimum number 
is just set equal to the total number of parameters. 

The logic for determining the rank deficient M is given in blocks 
(7) - (9). For each rank from the minimum to the full rank, the 
appropriate number of smallest eigenvalues are discarded and the information 
matrix inverse is computed. The associated parameter step is calculated 
and the likelihood function value is determined using the new set of 
parameter estimates. This involves computing the aircraft state and 
observation time histories and deriving the innovation sequence. Note that 
the third option enters in block (7) in the specification of whether a 
nonzero a priori weighting matrix is to be used. 

Blocks (8) and (9) are concerned with the solution where the cost 
determined from the new parameter set is greater than the cost of the 
previous iteration. (If this is the first iteration, the previous cost 
is that associated with the a priori parameter estimates). When the new 
cost is higher, the parameter step size is cut in half and the cost 
reevaluated. If the new cost is still larger than the cost from the 
previous iteration, the step size is cut in half again. This same procedure 
is repeated a given number of times. The reason for this step size 
cutting is the nonquadratic nature of the likelihood surface. 
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be carried out for 


This same step size cutting routine can 
the rank deficient solution procedure. The end result is 
(k) sets of parameter estimates, each one resulting in some value of 
the likelihood function. In block (10), the set of parameter values 
and step sizes associated with the lowest cost is picked out and 
retained. The other sets of parameter values need not be saved. 

The final option of the maximum likelihood program is to alter the 
parameter step size if any of the parameter constraints are violated. 

If this option is not desired, the parameter bounds are simply set to 
very large values. The routine for computing the optimal step size 
without exceeding the parameter constraints involves four calculations. 
The first calculation checks the individual parameter values to see 
which ones are on a constraint. If the parameter step associated 
with any of these parameters results in violation of the constraint 
boundary, that step size is set equal to zero. In the second 
calculation, the new set of parameter estimates are computed, using 
the modified step size (some elements are zero). 

In the third calculation, each component of the new set of 
parameter estimates is compared with the constraint boundaries. For 
any individual parameter value which is beyond the boundary, the 
absolute value of the ratio of the allowable parameter step to the 
actual parameter step is computed. This ratio is exactly the factor 
needed to have that particular parameter value fall on the constraint 
boundary. The smaller that factor, the farther beyond the constraint 
boundary the new parameter estimate would have been. In the last 
calculation, the smallest factor from among those computed for the 
individual parameter estimates is determined and retained. 

* This procedure is based on the Generalized Reduced Gradient method 

of Abadie (Ref .37) . 
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The final block (12) of the algorithm involves multiplying all the 
parameter step sizes by the smallest factor determined in block (11) or 
^ ^ t * ie constrained otimization option was not chosen. If the 

option was used.only one additional parameter estimate will be on the 
constraint boundary. All other parameter estimates besides the ones 
with zero step sizes will be within the constraint boundaries. 

The computation of a new step size for parameters marks the end 
of an iteration. To begin another iteration, these parameter values 
are used in the computations of block (3), and the cycle is restarted. 

The original cost now becomes the cost associated with these new 
parameter estimates. 
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V 


RESULTS OF IDENTIFYING AIRCRAFT STABILITY AND 
CONTROL DERIVATIVES 


This section discusses in detail, the experience and results of applying 
the maximum likelihood identification technique to simulated and real flight test 
data from three different aircraft. Included will be a discussion of the 
problems that were encountered and all the techniques that were used to 
alleviate them. Wherever possible, the cause of the problems is also 
spelled out along with possible implications for flight test orocedures. 


The first data that was used was from a computer simulation of 
X-22 VTOL aircraft. The aircraft model was highly non-linear and the 
data included process noise as well as measurement noise. Experiments were 
run with different input sequences and different measurement noise levels 
to investigate their effects on the parameter estimates. In all, 23 
parameters were identified, excluding the measurement and process noise 
covariances, which were assumed known. 

The second case involved actual flight data from an HL-10 lifting 
body. The digitized data, comprising approximately six and one-half 
seconds of flight, was supplied to SCI by NASA-Edwards FRC. A linear 
aircraft model was assumed in fitting the data and, in all, 20 parameters 
were identified, including the measurement noise covariance and the 
initial flight conditions. The data was assumed not to contain any wind 

gust (process noise) effects. 

The third set of data, also supplied by Edwards FRC, was from an 
M2/F3 lifting body. This data, covering approximately eight seconds of 
the flight test, did contain wind gust effects and represented 
a test of the maximum likelihood technique in reducing flight data which 
had not been successfully reduced by the output error technique. In all 
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22 parameters were identified, including the measurement noise covariance, 
the parameters (time constant and driving noise covariance) of a wind gust 
model and the initial flight conditions. 

5.1 X-22 Simulated Data 

At the time this contract began, the maximum likelihood method had 
been applied to simulated X-22 data, containing the effects of gusts, with 
very promising results. However there remained several important problem 
areas which needed further investigation and improvements to be made to the 
existing program. This section will outline these problem areas, including the 
method of approach, results, and conclusions. 


5.1.1 Generation of X-22 Simulated Data 

The model of the longitudinal motion of the X-22 is given in Appendix 
A. These equations can be put in the nonlinear form 

x = f (x,c,p) + g (x, p) v 

where 

T 

£ = ©> u, w] is the 4-dimensional state vector 

(q« pitch rate, 0= pitch angle, u= longitudinal velocity, *4 vertical 
velocity) 


p is the 23 x 1 vector of unknown parameters (consisting of 
the coefficients of the polynominal expansion in u of the 
derivatives H q , M,, M q , M { , X,, X { . Z<> , Z^, Z { ) 

c is the vector of deterministic control surface deflections and biases 


(1 ' { ea> 


({ 4 . 

es « elevator deflection) 


v is a 3 dimensional white, Gaussian process noise with mean 
0 and covariance Q. 
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The elements of the g matrix are obtained from the matrix of 
first partials || and, therefore, the parameters and states 
appearing in f also appear in g. 

The measurement equations are 



where 

n ^ i s are independent, white, gaussian measurement 

noise samples with the properties E { nj = 0 and E { j^n, } = Rfi tg 


Substituting for u, 4» and however, introduces, process noise 
into the measurement equations. The measurement equation can 
then be rewritten as 



where f 1 and g f are made up of specific rows of f and 
g, respectively. This gives rise to a correlation between the 
process noise and the measurement noise aow consisting of the sum of 


the vector n and 
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With the specification of an elevator deflection sequence. 6 . and 

es 

the process and measurement noise covariances, Q and R, the data, z, could 
be generated (using 4th-order Runga-Kutta integration of the nonlinear equations 

of motion). For each trial approximately 10 secs, of data was used, with a 
sampling rate of 20 per sec. 

5.1.2 Program Description 

The program that was initially used to extract the stability and 
control derivatives from the simulated data consisted of basically two 
parts. The first part was a least squares start-up routine (Ref. 13) 
which generated an initial estimate of the parameter values. This 
least squares technique is an equation error method which, ± n one pass 
through the data, obtains parameters estimates that minimize the 
following criterion 

m * n (*i " Vi.P)) 2 , i=i, . . ,4 

for each derivative x^ which is measured or derived. Since, in the 
X-22 simulation, q, n x and n y were measured, it was first necessary to 
express these quantities as linear functions of the parameters to be identi- 
fied. From Appendix A it is possible to write q as a linear function of 

the parameters (polynomial coefficients) in the derivatives M (u) , M (u) 

o v 7 * w v / * 

M (u) and M g (u); n x as a linear function of the parameters in X (u) 
es o 

V u > an< * X 6 ^ ’ anci n 6 as a l inear function of the parameter's in Z (u) 
es o 

Z w (u) and Z & (u) . Since no parameter appears in more than one expression, 
es 

a unique least squares estimate can be obtained for all the parameters. 

The second part of the program was the maximum likelihood 
identification technique in a form designed to identify the parameters 
of a non-linear model when both the process and measurement noise co- 
variances, Q and R, are known. The least squares parameter estimates were 
used as the initial conditions for the first iteration through the data 
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of the maximum likelihood routine. These estimates are updated with each itera- 
tion, until the algorithm converges. However, since one iteration through 
the data of the maximum likelihood technique required 1 minute of UNIVAC 1108 CPU 
time, only a few iterations were used. 

5.1.3 Limitations of Previous Results 

The first limited trial of the maximum likelihood identification 
technique applied to the problem of extracting aircraft stability and 
control derivatives was on simulated X-22 VTOL data supplied to SCI by 
Cornell Aeronautical Laboratory. The complete data set consisted of four 
cases; two without process noise and two with process noise. In each case 
a single step input was used to generate the data. For the no process noise 
cases 2A and 2C, satisfactory estimates were obtained for all parameters except for 
X and Z 6 derivatives. For both the low process noise (2B) and moderate process 
noise (2D) cases, the errors in all the identified parameters were much larger. 
However, when a multistep input sequence, which supplied much more excitation, 
was used, the results for the process noise case improved greatly. It soon 
became apparent that the quality of the parameter estimates were very "input” 

dependent. 

As outlined in Section 4.3 , the calculation of the update in the parameter 

estimates involves the computation of the gradient 3J, and the information matrix 

3p 

3^J, where J is the likelihood function (Eq.4.29). These quantities in turn in- 
volve solving a differential equation for the sensitivity matrix 3x (i/i-1), 

where ic (i/i-1) is the output of a Kalman filter. Since the equation for the 
state esti ma te error covariance, P (i/i-1), does not reach steady state and 
involves the unknown parameters, the partial derivatives of P (i/i-1), and the 
Kalman gain W^, with respect to p should be included in the computation of 
3J and 3 2 J . (See Appendix B) These were neglected in the earlier X-22 
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identification work. It was orginally thonght that the lack of monotonic 
convergence of the identification algorithm could be attributed to these 
extra partials being neglected in the computation of the partial derivatives 


As noted earlier the accelerometer measurements introduce process noise 
into the measurement equation, thereby correlating the total effective 
measurement noise with the process noise. This correlation which effects 
the equations of the Kalman filter and, therefore, the computation of the 
sensitivity matrix, was not accounted for in the earlier application. There 
is an additiona! correlation between the g(.) function in the dynamics equation 
*" t e g (.) function in the measurement equation since both are a function 
of the state estimate, x (i/1-!) . It was originally thought that this might 
also.have a significant effect on the parameter estimates and on the standard 
deviations supplied by the Craraer-Rao lower bound. 


In both the data supplied by Cornell and generated at SCI the process 
noise and control were kept constant over an integration step. It was important 
to distinguish the cases where the process noise changed value before or 
after a measurement. In one case there would be a correlation between the 
measurement noise at a sampling point and the process noise during the pre- 
ceeding integration step while in the other case the ocrrelation would be with 
the succeeding integration step. This difference, though subtle, is important. 

All these areas were investigated with the objective of determining the 
effects, on the parameter estimates and standard deviations, of different 
modifications. The following description of the results is broken up into 
separate sections, each one involving a different area of investigation. 

Included in an accompanying table are the parameter estimates and standard 
deviations resulting from each change in the algorithm. These standard de- 
viations are actually lower bounds on the actual values and are obtained from 
the diagonal elements of the matrix / 9 2 j\ -1. Also noted, in each case, is 

\3P 2 / 

the number of iterations of the algorithm used in obtaining the results. 
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5.1.4 Comparison of Results with Single Step and Mu lti-Step Input Sequence s 


The initial processing of the data supplied to SCI by Cornel Aero- 
nautical Labs resulted in unsatisfactory parameter estimates both for 
the low and moderate process noise cases, as shown in Columns 1 and 2 
of Table 5.1. Since it was already known that the input sequence shown 
in Fig. 5.1, used to generate the Cornell data did not sufficiently 



FIGURE 5.1 INPUT SEQUENCE USED IN GENERATING CORNELL DATA 


excite all the modes of the system to allow adequate identification and 
since there was considerable uncertainty already as to how this data 
was generated, SCI programmed its own data generator using the equations 
of motion in Appendix A. The elevator deflection sequence used by SCI 
in generating the X-22 simulated flight data is shown in Fig. 5.2 and 
the process and measurement noise standard deviations are given in 
Table 5.2. 
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TABLE 5.1 X-22 IDENTIFICATION RESULTS 


1 t Casa 2 
After 
Initial 
J Changes 

— - 

D 2: Casa 2 
After 
Initial 
Chance* 
3.37i5 

B 3: Case 2 
Using SCI 
Mult latop 

3.5388 

B 4: Case 21 
With 
Forward 
Correlatlo 
3.5499 

3 i Case 21 
With 
Constant 
n 

3.3355 

1 Casa 28 

With 

Constant g 
and Corralatioa 

138.3 

72.46 

28.04 

30.96 

31.47 

— 3L 517 

32.02 

-1.632 

-.835 

-.159 

-.206 

-.209 

-.22* 

355 .00429 

.00211 

-.000441 

-.000284 

-.000292 

4.000174 

-1.248 

-.471 

-.0947 

-.137 

-.148 

-.166 

17 .00558 

.000009 

-.00321 

-.00274 

-.00268 

-.00235 

-1.0721 

-.995 

-.489 

-.520 

-.500 

-.436 

03 .00434 

.00278 

-.00116 

-.000814 

-.000973 

-.00182 

-11.85 

4.45 

18.3a 

18.17 

18.15 

18.11 

.311 

.179 

.0708 

.0734 

.0733 

.0758 

35.80 

23.01 

18.88 

19.41 

17.87 

17.97 

-.352 

-.160 

-.103 

-.110 

-.0849 

-.0874 

.000659 

-.0000539 

-.00024 

-.000228 

-.00033 

-.000316 

.0162 

.17* 

.220 

.2065 

.2233 

-.223 

9 -.000051 

-.00122 

-.00159 

-.00143 

-.00159 

-.00160 

8.586 

-.976 

-.691 

-1.011 

-.896 

-.875 

-.0521 

.0202 

.0171 

.0212 

.0199 

.0196 

IBSm 

-31.22 

-35.15 

-27.08 

-30.19 

-28.03 

IKarJ 

.511 

.969 

.825 

.879 

.813 

-•0037 

.00568 

-.00728 

-.00669 

-.00692 

-.0066 

-1.320 

-.599 

-.272 

-.361 

-.3397 

-.356 

7 .00516 

-.000399 

-.00314 

-.00213 

-.00233 

-.00213 

20.6 

-11.26 

-.673 

-1.143 

-1.136 

-1.11 

-.122 

.106 

.0200 

.0269 

.02697 

4.0271 

*• 3 

2 

2 

2 

2 

2 




.5794 

.9158 

.9061 

76.18 

16.47 

2.150 

2.249 

2.583 

3.501 

1.104 

.254 

.0352 

.0371 

.0418 

.0573 

.00405 

.000996 

.000145 

.000154 

.000170 

.000235 

.695 

.138 

.0251 

.0259 

.0299 

.0399 

.00581 
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.000244 
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.467 

.183 

.0139 

.0146 

.0144 

.0157 

.00371 

.00135 

.000152 
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100.4 

18.64 

.279 

.288 

.3339 

.3368 

.7792 

.145 
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.00315 

.00361 
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20.35 

4.93 

.873 

.873 
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1.010 

.274 

.0688 
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.000940 
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.779 
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.0749 

.00657 

.00142 

.000321 

.000330 

.000507 

.000696 
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TABLE 5.1 CONTINUED 


/ . Cam 21 
With 

toM Added 
ItftUll 

J.5767 

g j Cue 2D 
With 

Constant g 
No Added 

fattlals 

3.51684 

$ t Cam 2D 
With 

Constant g 
end Add 
Partial 
3.6949 

)3.» 

47.291 

76.69 

-.261 

-.427 

-1.049 

. .0000359 

+. 000295 

.00345 

-.157 

-.421 

-.583 

-.00247 

.0000073 

.00193 

-.502 

-.5088 

-.321 

-.00102 

-.000826 

-.00345 

It. 04 

15.92 

17.79 

.0752 

.1003 

.0807 

19.17 

16.349 

13.50 

- -.109 

-.0593 

-.0359 

-.000221 

-.000449 

-.000449 

.215 

.228 

.2764 

-.00150 

-.00159 

-.00191 

-.952 

-1.402 

-.661 

.0207 

.0261 

.0181 

-25. BS 

-14.18 

29.94 

•796 

.636 

-.179 

-.00656 

-.00619 

-.00245 

-.569 

-.634 

-.9659 

-.00196 

.000741 

+.00389 

-1.273 

-4.618 

-1.146 

•02B9 

.0703 

.0278 

2 

2 

2 

.5829 

22.8934 

14.06 
.2289 
.000934 
.1628 
.00151 
.0760 
.000824 
1.706 
.0181 
4.270 
.0688 
.000277 
.0498 
.000462 
.544 . 
.00581 
25.27 
.406 
.001131 
.2953 
.00273 
3.231 
.03443 

23.167 
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FIGURE 5.2 MULTISTEP INPUT 


TABLE 5.2 STANDARD DEVIATION OF PROCESS AND MEASUREMENT NOISE 


Standard Deviation 


gusts (process noise) 


measurement noise 


Low 

Moderate 

1,0 fps 

5.0 fps 

1.0 fps 

5.0 fps 

.2 deg/ sec 

1.0 deg/ 

0.5 fps 

2.5 fps 

0.075 fps 

.375 fps 

.03 deg 

.15 deg 

.01 deg/sec 

.05 deg/ 

001 g 

.005 g 

005 g 

.025 g 

2 

>025 deg/sec 

.0125 deg/ 
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The initial state estimates used in the Kalman filter were set equal 
to the state measurements observed at the first data point, i.e., 

£ (i/i) ** Since these measurements consist of the true state 

£ (1) plus noise, the error covariance P (1/1) is given directly as 
P ± (l/l) = E {( x (1) = ft (1/1) ) <x (1) - ft (1/1) ) T ) ■ dia 8 ^ R n» R 22’ 

^33* ^4A ^ where R ^i is t ' ie 1^ diagonal element of the measurement noise 
covariance matrix, R. 


The effect of using the SCI data generation program with a multistep 
input sequence instead of a single step input sequence can be seen by 
comparing Cols. 2 and 3 of Table 5.1. The parameter estimates are greatly 
improved and the standard deviations are reduced. This enhanced ability to 
identify the parameters is attributed to the fact that the more varied the 
input sequence, the more the system modes are excited and the higher is the 
signal- to-noise ratio at the output. 

It is important to realize that the results as shown were obtained 
for only one noise sequence, and therefore, although the parameter esti- 
mates improved considerably, the parameter estimates by themselves are 
not sufficient for comparison. Neither are the costs, themselves, 
since changes in the noise sequence will influence the costs. 

As will be seen in the later areas of investigation, some of the para- 
meter estimates improved as the result of some change in the algorithm and some 
did not. This is almost always the case, and unless the individual relative 
effect of the parameters on the cost is known, it is very difficult to say 
on the basis of only a few of the parameter estimates improving, that the 
algorithm itself is improved by any change. The criterion that is more 
suitable for comparison is the standard deviation of the parameter estimates. 

A lower bound on the standard deviations is obtained from the inverse of the in- 
formation matrix and this is adequate in many cases. However, if the differences 
in standard deviations are small, the Cramer-Rao lower bound may not reflect 
these differences. 
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5 • 1 • 5 Comparison of Forward and Backward Correlat ion 


The next area of investigation involved the effects on the parameter 
estimates and standard deviations of the type of correlation between the 
input and output noise sequences* The original Cornell data specified that 
the process noise was kept constant over an integration step. This meant 
that whatever correlation existed at the measurement times (note that the 
measurements are taken at discrete instants) also existed throughout the 
entire integration interval* The SCI multistep data was first generated with 
the accelerations q, and n^ being calculated using the process noise from 
the previous integration step and the control, which was also held constant 
over an integration step, from the next integration step. Figure 5.3 below 
graphically shows when the values of v^ the process noise, and 6 , the 
control, were changed in relation to the measurement instances. 



Process Noise 


Control 


Measurements 


FIGURE 5.3 v AND 6 ORIGINALLY USED IN THE CALCULATION OF q n n 
(DOTS INDICATE THE INTEGRATION TIME POINTS) * x * y 
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This seemed inconsistent with the way correlation between process and 
measurement noise is usually represented in state space models. For example 
the discrete analog of the continuous time representation 


x(t) * Fx(t) + G v(t) 
y(t) = Hx(t) + n(t) 


is 


Vi 


= $ 


*k 


+ rv. 


'k 


H 




where 
model , v. 


(t k’ C k+1 


and G,r are related by T, the sampling interval, 
and n k are correlated, i.e. the process noise durin 
is correlated with the measurement noise at t fc . The 


In this 


correlation between n^ 
is calculated using 
means that the correlation between 


and v k effects W 
(the control at time 
v (t) and 


■V* 

n. 


not Xj^. Similarly , 

In continuous time this 
must be during the 


interval between 
time 

IS. 

and t. . Therefore, 


and 


n 


k+1 
us 

and n 


In addition, the derivative x(t) at 


t must be calculated using the control that existed between t^ 
k 


should be calculated using the 

euu i » *" x — - y 

process noise that will exist in the next integration step and the contro 

that existed in the previous integration step, as in Fig. 5.4. 


J 


'i+1 




i+2 


Process Noise 


'i+1 


3 i+2 


Control 


FIGURE 5.4 v and 6 USED IN CALCULATING OF q, n , n AFTER CHANGE 

a y 
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Note that the correlation between measurement and process noise has changed 
from a backward correlation to a forward correlation, i.e. the measurement 
noise at the sampling instant is now correlated with the process noise during 
the succeeding integration interval. This forward correlation, if unmodeled 
with effect the (forward) Kalman filter operation while the backward correlation 
will effect a second (backwards), smoothing run through the data. In an actual 
application with continuous time dynamics and measurements, this problem will not 
arise. However, if discrete measurements are recorded, the type of correlation that 
exists will be an issue, and for data generated by a physical system the forward 
correlation is the more natural. 


The effect of using the forward correlation in the data generation 
and then identifying the parameters, although not accounting for this forward 
correlation in the Kalman filter, is shown in Cols. 3 and 4 of Table 5.1 (for 
the low process noise case, 2B) . Almost all the parameter estimates have 
degraded, offset by an accompanying slight increase in standard deviation. 
These results indicate that if forward correlation (the type normally used in 
computer models of discrete systems) is not modeled, it can have a detrimental 
effect on the quality of the parameter estimates. 


5.1.6 Additional Performance Index 

In Col. 4 of Table 5.1a new performance measure is introduced, 
labelled J’. This represents the unweighted mean square error in 
estimating the output, given by 

1 N 

J ' = 2N J^'i " h(x i/i-l’ [z i " h( *i/i-l» 

J' was introduced as another means of comparing the results of the 
different runs and differs from J by the fact that changes in the Kalman 
filter only effect it through the state estimates It was not 

substituted for J in the identification algorithm since it does not weight 
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the various state estimates and is not the likelihood function. As was 
pointed out earlier, the weighting matrix used in J was a function of 
the Kalman filter covariance and therefore varied as different changes 
or additions were made in the filter. 

Experience with the algorithm and the two costs, J and J* , indicated 
that J' was more sensitive to the parameter estimates than J, which 
always appeared to be in the range 3.5 to 4. However, this is to be 
expected since, if the parameter estimates are bad, the state estimates 
will likewise be bad and the associated state estimate error covariances 
will be large. The weighted residuals, which are inversely proportional 
to the state estimate error covariance, may therefore change very little, 
j' , on the other hand, has no weighting, and therefore reflects the 
absolute accuracy of the state estimates. 

5.1.7 Accounting for Correlation Between Process and Measurement Nolse^ 

The initial attempts to account for the correlation between the 
measurement and process noise were not successful either in reducing 
the cost J or in improving the parameter estimates. It was decided that 
part of this problem was due to the fact that the noise term gCx^)^ 
appearing both in the system equations and the measurement equations, depends 
on the state x^ and gives a long-term correlation. For this reason, it was 
decided to alter the system equations to include a constant g matrix, 
calculated from the initial control values, the nominal state values and 
the actual parameter values. 

The identification program was first run without accounting for 
the process noise-measurement noise correlation, in order to get a 
new standard for comparison. As shown in Column 5 of Table 5.1, 
many of the parameter estimates were worsened and the J' cost increased 
slightly. The fact that the J cost decreased slightly can be attributed 
to the fact that the weighting matrix is a function of the g matrix. 

Also the worsened parameter estimates were not, in all cases, offset by 
increased standard deviations. 
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The improved accuracy of the parameter estimates with the non- 
constant g matrix can be attributed to the fact that the parameters 
of the g matrix account for the system gain factors and are therefore 
easily identifiable. Since, in the original system model, the same 
parameters appeared in the f and g matrices, the parameter estimates 
were overall improved. 

It was agreed that, although the constant g assumption was a 
large change from the original problem, it did not represent a 
departure from reality. As can be seen from the system equations, the 
g matrix was originally constructed from the linearized f (*,*) 
matrix, the motivation being that the process noise would then 
enter the dynamical equation linearly. 


With the constant g assumption, the correlation between process and 
measurement^ noise was accounted for by adding the following terms (indicated 
by | t0 the ident ification algorithm. ( See Section 4.3) 

Defining S ± * E{vJ n ± } to be the measurement noise and process noise 
correlation at the i measurement time: 


Kalman filter state prediction: 


f(x) 


I 

+ 1 

!. 


g s 


i-1 


R -1 [z 


i-1 




I 

I 

I 


Kalman filter covariance prediction: 


P i/i-l “ Vl P i-l/k-l *1-1 + 8 Q S T (AT) 2 

-1 T T o I 
1 " g S i-1 R S i-1 & < AT > I 
I 1 
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9f 
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3x 9h 
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9x 

3p 3p 
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■J 


x=x i-l/i-l x= *i-l/i-l 


Sensitivy equations: 


_d 

dt 


3x 

3p 


3f 3x 3f 

3x 3p 3p 


- I 


_ _ , . , 3x jHi 

g S i-1 [_ 3x | 3p 3p 


r _i r 


'i-1 


< t < t. 


x=x 


i-l/i-1 


The resulting parameter estimates and cost are given in Col. 6 of 
Table 5.1. Comparing these results with those of Col. 5 it is seen 
that, while some parameter estimates improved, others did not. The overall 
cost J remained the same, while J’ decreased slightly. An important 
point is that the standard deviations for almost all the parameters increased. 
This implies that, with the initial set of least squares parameter estimates 
as good as they are, the inclusion of the terms accounting for the input/output 
noise correlation does not gain much by way of the parameter estimates. 

However, with these correlation terms included, the standard deviations 
come out to be more realistic, in view of the differences between the actual 
and estimated parameter values. All this is not to say, however, that for 
a less exact set of initial parameter estimates, the correlation terms won’t 
improve the algorithm performance. 


5,1.8 Inclusion of Additional Partial Derivatives 

There were two principle motives for adding the additional first order 
partial derivative terms of the state covariance matrix to the identification 
algorithm. The first was that they would be required if Q and R were to be 
identified, since they both appear explicitly in the equations for the state 
estimate error covariance. The second motive was that the cost J, instead 
of mono tonically decreasing with each iteration, was oscillating. 
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A possible cause of this was that the gradient direction was being 
calculated incorrectly due to the fact that the neglected partial 
derivatives have an appreciable effect on the gradient of the likelihood 
function in the vicinity of the minimizing set of parameter estimates.* 

It was necessary, therefore, to investigate the importance of these 
extra partial terms in identifying the system paramters. 

The inclusion of the additional first order partial derivatives into 
the identification algorithm presented special problems due to the nature 
of the X-22 model. Since the g matrix is a function of both the states 
and of the parameters, the derivative of the g Q g tern (appearing in 
the covariance equations) with respect to p must be included. This is a 
particularly lengthy computation. A first attempt to include all the 
additional partials except those of the g Q g term is shown in Col. 7 of 
Table 5.1. Comparing this with Col. 4 (since constant g was not assumed), it 
is seen that both costs J and J 1 increased slightly and some of the parameter 
estimates, themselves, are slightly degraded. 

Two changes were then decided upon. The first was that case 2D 

(moderate process noise) would be used instead of 2B (low process noise). 

The second was that the g would again be modeled as constant. 

The first change was motivated by the desire to see larger variations 

in the costs. More process noise would make the initial least squares 

parameter estimates worse and therefore the effect of the additional 

partials would, potentially, be the greatest. The second change was 

motivated by the fact that with the constant g assumption, partials 
T 

of the g Q g terms are identically zero. 


* A second possible cause was too large a step size, which was corrected 
by halving the step size along the calculated gradient direction whenever 
the cost J, increased. 
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Column 8 of Table 5.1 gives the parameter estimates, cost and standard 
deviations for the 2D case without the added partial terms, and with the 
constant g assumptions. Column 9 gives the same results with the 
additional partials included in the algorithm. Once again the cost, J, 
increased slightly ( % 5%). This can be attributed to the fact that the 
convergence of the algorithm with the added partial terms may be slower. 

The standard deviations of these parameter estimates (for both Cols. 7 and 9 ) 

are not given since their calculation requires another full iteration of the 
algorithm. Since the added partials quadruple the run time per iteration, 
it was decided not to calculate these values. The important point is that, 
considering the slight variation in J, even for this worst case, and the 
possible benefits of the added partials in terms of the vastly increased 
run time, it is not necessary to include these added partials in the 
identification algorithm. 

5.1.9 Aerodynamic Derivative Estimates 

As was noted at the beginning of this section, the aerodynamic deri- 
vatives themselves were not identified* Rather, the coeffients of first 
or second order polynomial expansions in the longitudinal velocity, u, of 
these derivatives were identified. Using these identified coefficients, it 
was then possible to reconstruct the time histories of the total derivatives, 
and compare these estimates with the actual values. These comparisons are 
shown in Fig. 5.5 for the 2B data and the original model structure. The 
fits to most of the derivatives was good. This indicated that, although some 
of the estimates of the polynomial coefficients had relatively large uncer- 
tainties , their influence on determining the total derivative behavior was 
small. 
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FIGURE 5.5 (CONT.) X-22 ESTIMATED AND ACTUAL (SIMULATED) STABILITY AND CONTROL DERIVATIVE TIME HISTORIES 





5.2 HL-10 Flight Test Data 


This data was used mainly for checking and validating the maximum 
likelihood program. Flight data for the HL-10 lifting body was supplied 
to SCI by NASA FRC (Flight Research Center) Edwards along with as much 
information about the flight condition as was available at the time. FRC 
also supplied SCI the results of their stability and control derivative 
extraction program along with the specific measurement and a priori parameter 
weights that were used. The HL-10 data did not contain gust effects and a 
linear model for the lateral dynamics was used. 

It should be noted for the case of unknown measurement and process 

noise covariances an additional term, Nfcn (detCHP^HV R)) is 

added to the cost criterion, where P oo is the steady-state Kalman filter 

s s 

error covariance matrix. This is because the weighted mean square error 
^ill always have a constant value of * (no. of states) since the weighting 
matrix is the sample covariance. 

5.2.1 Dynamical Equations of Motion and Observation Equations 

The linearized lateral equations of motion, including the effect of 
the wind gusts, are: 
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n » 



6 u r 


where p is roll rate (°/sec)* 

r is yaw rate (°/sec) 

3 is sideslip angle (°) 

4> is roll angle (°) 

6 is aileron deflection (°) 

6 is rudder deflection (°) 

is wind gust in equivalent sideslip angle (°) 
is a transformation matrix 



All the quantities are in the body axes system^ 
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For both the HL-10 and the M2/F3 flight data, (which is discussed 
in Section 5.3), a , the angle of attack, y, the flight path angle, 
and V, the velocity, were assumed constant over the data record and 
their values were supplied. The same was true of 1^, I x and I z * In 
most cases, Y and Y were assumed to be zero, which implied that there 
were nominally 20 parameters to identify (7 in F, 9 in G and 4 
initial conditions), excluding the wind gusts and biases in measurements. 


The observation equations, again assuming the existence of gusts, 
are given below: 



(5.2) 


where y^ is the lateral acceleration 

n^, i =* 5 are independent white noise Gaussian measurement errors. 
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Notice that B n always appears with B when gusts are assumed to exist. 


In the processing of the HL— 10 data, B^ = 0, thereby leaving the 
measurement noise, n, as the only random noise source. The Kalman filter 
equations reduce to the original state equations and the Maximum Likelihood 
Method is essentially similar to the Output Error Method with the exception 
of the weighting matrix. 

5 -?*2 Characteristics of HL-10, Flight 19-2 

The HL-10 data was assumed to contain no gusts and was, therefore, 
processed by using the generalized output error criterion. The results, 
however, were somewhat unexpected and, in all, eight different runs, each 
with a variation on the original output error run, were made to solve the 
problems which were encountered. It was later learned that these same 
problems had been encountered by FRC. Each of the different runs are 

described in this section, along with the objectives, observations and 
conclusions particular to each. 

Along with the observations and control time histories (see Fig. 5.6) 
wind tunnel derived parameter estimates were supplied to SCI by .FRC. There 
were 327 data points at a sampling rate of 50 per second, for a total 
of 6.54 seconds of data. The angle of attack of this flight was 16.8° 
and the mach number was 1.22. 

It was evident from the data that a substantial amount of clipping 
and quantization had occurred during data collection. An accurate model 
for the observations would include the dynamics of the instrumentation 
system, but no information was available. One effect of not including a 
model of the instrumentation dynamics and quantization effects would be 
to have correlated nongaussian residuals, in each of the observations. 
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FIGURE 5.6 HL-10 OBSERVED DATA AND CONTROL SEQUENCE TIME HISTORIES 
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5.2.3 Results of Flight 19-2 


The first processing of the HL-10 data was with the maximum likelihood 
identification program in an output error mode. As such, it differed from 
the FRC Newton-Raphson identification program in only one way, the weight- 
ings of the fit errors. FRC uses a constant weighting matrix, based on 
an idea of the instrumentation accuracies. The SCI approach estimates 
the measurement noise covariance matrix and weights the fit errors by 
the inverse of this matrix. It is shown in Section 4.4 that the ML estimate 
of the measurement noise covariance matrix is given by the sample covar- 
iance of the observations. Edwards also uses a priori weighting on the 
difference between the parameter estimates and the wind tunnel estimates, 
which was not initially included in the SCI maximum likelihood program. 

The time histories of the five response variables along with the 
estimated values obtained after 11 iterations of the data are given in 
Figure 5.7. Included also in Figure 5.8 is the fit error in the p and r 
observations. A comparison of the parameter values themselves and the wind 
tunnel values is given in Table 5.3 along with the associates standard devia- 
tions in the parameter estimates. The parameter values obtained from the 
FRC output error method with fixed weights after 7 iterations and a priori 
weighting are also shown in Table 5.3, along with the associated confidence 
bounds. 

As Figure 5.7 shows, the fits in all the observations were very good. 

However, as is often the case, the fit error alone does not indicate an 

acceptable set of parameter values. The two major problems that appeared 

were that (1) the signs of the L , L , N and N r derivatives had all 

* * 

changed from those of the wind tunnel values and (2) the fit error in 

* In this investigation, it was assumed that the wind-tunnel and theoretical 
values had correct signs. This may not necessarily be the case for lifting 
bodies flown at transonic speeds due to limitations of wind-tunnel testing 
and theoretical calculations. The question of how much confidence can really 
be placed in these values has not been resolved. 
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TABLE 5.3 HL-10 PARAMETER ESTIMATES AND STANDARD DEVIATIONS 


Parameter 

Wind 
Tunnel 
and Theo- 

FRC Values with 
a priori weighting 
(with confidence 
bounds) 

Max. Lik. Estimates 
(with St'd. dev.) 

Max. Llk. Estimates 
with Y and Y r 
(with St a dev.) 

Max. Lik. Estimates 
With a priori Weighting 
(with St’d dev.) 

Max. Lik. Estimates 
With a priori weighting and Bias 
(with Sc*d dev.) 

L 

P 

-0.3435 

-0.3436 (0.196) 

0.915 

(0.025) 

0.395 

(0.022) 

-0.295 

(0.0114) 

-0.271 (0.0114) 

L 

r 

0.2723 

1.188 (0.0196) 

-1.363 

(0.138) 

0.0671 

(0.111) 

1.574 

(0.0728) 

1 ’ 349 (0.0747) 

L s 

-30/75 

-52.073. (1.18) 

-56.489 

(0.305) 

-46.94 

(0.417) 

-47.179 

(0.285) 

-50.32 (0.330) 

N 

P 

0.0245 

0.0326 (0.0157) 

-0.160 

(0.00429) 

-0.187 

(0.00370) 

0.0380 

(0.00352) 

0.0550 (0. QJD323) 

N 

r 

-0.1290 

-0.1114 (0.0157) 

0.432 

(0.0187) 

0.548 

(0.0152) 

-0.111 

(0.0114) 

-0.0896 (O.Olie) 

N s 

6.8411 

7.0496 (0.118) 

8.523 

(0.719) 

7.474 

(0.0978) 

6.292 

(0.0574) 

6.845 (0.0570) 

Y 

P 

-0.0617 

-0.0584 (0.00392) 

— 


0.329* 

(0.00307) 

0.335* 

(0.00192) 

0.310 (0.00192) 

Y 

r 

-0.0120 

-0.0122 (0) 

— 


-1.091* 

(0.0122) 

-1.064* 

(0.00754) 

-1.019 (0.00762) 

y e 

-0.0916 

-0.0855 (0.00392) 

-1.471 

(0,0202) 

-0.1458 

(0.0161) 

-0.0949 

(0.00383) 

-0.0918 (0.00382) 

L <s« 

11.2464 

11.996 (1.18) 

12.415 

(0.0731) 

12.494 

(0.0775) 

12.124 

(0.0586) 

12 - 2 82 (0.631) 

L 6« 

5.665 

5.877 (1.18) 

6.288 

(0.173) 

6.544 

(0.148) 

6.252 

(0.111) 

6 -429 (0.114) 

L 

o 


- 

21.03 

(0.302) 

17.341 

(0.222) 

0.247 

(0.117) 

-0.0137 (0.0951) 

N. 

6 s 

0.8135 

1.456 (0.118) 

1.262 

(0.0136) 

1.245 

(0.0201) 

1.404 

(0.0131) 

1 * 357 (0.123) 

N . 
6r 

-3.617 

-3.178 (0.118) 

-3.186 

(0.033) 

-3.313 

(0.0326) 

-3.257 

(0.0229) 

-3-1 94 (0.0212) 

N 

o 

— 

- - 

-1.633 

(0.0479) 

-1.206 

(0.0401) 

1.561 

(0.0205) 

(0.0163) 

Y 6o 

-0.00180 

-0.0018 (0) 

0.0623 

(0.00390) 

.0515 

(0.00373 

i -0.0231 

(0.00435) 

-0.0482 (0.00443) 

Y 6r 

0.0111 

-0.00427 (0) 

0.919 

(0.00992) 

0.0629 

(0.00895) 

! 0.0717 

(0.00782) 

°*519 (0.00768) 

Y 

0 



0.513 

(0.00898) 

0.412 

(0.00778) 

, -0.751 

(0.0112) 

-°‘ 63 9 (0.00829) 

^initial 



1.76 

(0.158) 

3.325 

(0.146) 

i 2.225 

(0.116) 

2 -348 (0.117) 

R initial 



0.117 

(0.322) 

0.0251 

(0.0263) 

j -0.710 

(ol* 021 ) 

-0.599 (0.0201) 

^initial 



0.398 

(0.0188) 

0.1305 

(0.0152) 

; -0.226 

(0.0125) 

219 (0.0111) 

^initial 



1.939 

(0.0697) 

1.767 

(0.0511) 

-1.748 

(0.0663) 

-1.429 (0.0571) 

Likelihood Function Value 

-2243 

-2359 

-1552 

-2264 I 


A 

The identified quantities sre Y p + sin » , Y^- cos o 

















Fig. 5.8 HL-10 FIT ERRORS IN p AND r MEASUREMENTS - OUTPUT 

ERROR 
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the observations (although only p and r are shown) exhibited a sinusoidal 
characteristic* Although the precise reasons for these problems are not 
known, there are several contributory factors. 

The first factor is that the parameters with the opposite signs 
are weak parameters that is, they cannot be accurately identifed from 
flight test data. This is the approach FRC has taken with the L r and 
Np parameters and in the a priori weighting, they weighed their wind 
tunnel values very strongly* There is also an id entif lability problem 
because the stability augmentation system (SAS) was used on this flight. 
Such a system would tend to suppress certain modes of the system, while 
emphasizing others. The parameters of the suppressed mode are, therefore, 
hard to identify* 

A second factor could be that the linearized dynamics are not accurate 
enough for the flight conditions of this data* Also, there may be coupling 
between the longitudinal and lateral modes, which is not included in the 
model. 

A third factor, which may account for the sinusoidal characteristic 
in the fit error, is that due to the instrumentation dynamics, the measure- 
ment noise is actually correlated. This hypothesis could be verified by 
reprocessing the data with the measurement noise modeled by a second order 
linear system. Final verification of these possibilities would have to be 
based on processing additional data under similar flight conditions. 
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In many instances, the solution to such problems is to adjoin to the 
likelihood function other measures of performance, usually indicating some 
a priori knowledge of the parameter values. Several examples of these 
are the a priori weighting and constrained parameter values which are 
discussed later. A less direct approach, of reducing the dimensionality 
of the parameter space, is also presented. 

5.2.4 Ou tput-Error with Y p and Identified 

The remaining series of runs were all aimed at solving the problems 

encountered with the first processing of the HL-10 data. First, the Y p 

and Y derivatives were considered as two additional parameters to be 
r 

identified. 

In the previous run, both Y p and Y r were considered zero. However, 
by examining the equations of motion, p can be expressed as a function of 
(Y + sin a) and r can be expressed as a function of (Y f - cos a)r. This 
would Introduce previously neglected second order effects into the estimate 
of p and r, and possibly account for the sinusoidal characteristic in the 
fit error. 

The results indicated that the fit in each of the observations was 
about the same as in the straight output error case although, as shown 
in Table 5.2, L r does have the same sign as the wind tunnel value and L p is 
less positive. However, on the other hand, both N p and are worse. 

In addition, the sinusoidal characteristic of the fit error remained, 
diminished only slightly. 

5.2.5 Output Error With Constrained Parameter Values 

Since it seemed clear that the opposite signs on the four parameters 
yere a result of trying to minimize the fit error, and until additional 
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terms were added to the model , these signs were likely to remain opposite, 
the next run constrained the values of and to remain negative. This 
would answer the questions of whether there was a set of parameter values 
which would minimize the cost criterion (although not globally) with the 
indicated parameters having the same sign as the wind tunnel values. If 
these values remained on the constraints, no such minimizing set exists. 

The results of this run were that the and values remained on 
the constraints and the and values again had the opposite signs. In 
addition, the fit in the observations was drastically degraded. Only the 
fit on r was of equal quality as in the two previous trials. 

5*2.6 Output Error With Different Initial Conditions 

One remaining possible cause of the changed values could be the 

presence of local minima having the opposite signs on L , L , N and N . 

P r p r 

The next run used initial parameter estimates of L p and which were 
more negative than the wind tunnel values, while keeping the control 
derivatives the same. The results of this run were that the signs of all 
four parameters (L^, , B^, N^) were again reversed and, if more itera- 

tions had been performed, the final parameter values would have, more 
than likely, been equal to the initial set of output error values. 

Although the results from many sets of Initial parameter values would 
be necessary to conclusively determine if the values from the initial 
output were truly the global minimum, it appears that this might be the 
case. If the signs on the four parameters are to be the same as the wind 
tunnel values, an additional cost must be put on the difference between 
the parameter estimates and the wind tunnel values. This is precisely the 
reason for the n a priori weighting” mentioned earlier. 

5.2.7 Output Error With A Priori Weighting 

The values for the parameter weights used in this run were obtained 
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directly from the FRC’s runs supplied to SCI. Figure 5.9 shows the time 
histories of the observations and the resulting estimates (except for <J>) 
for this weighting. It is clear that the fits to the observations, except 
for p, have been degraded. However, as shown in Table 5.3, the values 
of the four indicated parameters have the same sign as the wind tunnel 
values . 

It was found, however, that except for p and r there was appreciable 
bias in the fits to the observed data. Not accounting for this bias 
in the computation of the measurement noise covariance will cause 
incorrect weights to be assinged to the different observation residuals. 

This will effect both the computation of the gradient and the information 
matrix, resulting in incorrect parameter step sizes. Another run was 
made with the sample bias of the observation residuals computed, at each 
iteration, and accounted for in the sample covariance calculation. The 
fits to the observed data for this second processing of the data with a 
priori weighting are shown in Figure 5.10. The fits to r, 3 and <f> are much 
improved over the previous case. Only the fit to the lateral acceleration 
data has not improved. As shown in Table 5.3, many of the parameter 
estimates for the run are closer to the original wind tunnel values than 
for the previous run, without considering the biases. It, therefore, 
appears that when using a priori weighting, consideration must be given 
to the possibility of having biased residuals, which must be used in 
computing the sample covariance. 

A final processing of the data with a priori weighting and including 
the identification of the output biases was made with the additional 
feature of retaining only the diagonal elements of the sample covariance 
for the estimation of the instruction noise variances. All the off-diagonal 
terms were set to zero. The rationale for this was that each of the 
measuring instruments on board the aircraft operate independently and there- 
fore the errors would be uncorrelated. The fit to the observed data did 
not improve over the previous run and many of the parameters were now 
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farther removed from the wind tunnel values. 


No additional processings with varied a priori weights were made, 
since sufficient data by which these variations could be justified were 
not available. 

5.2.8 Parameter Estimates Use d for Prediction 

It has been often stated that using a set of estimates for the stability 
and control derivatives to predict the measurements from a flight test, 
under similar conditions and with similar instruments as the one used to 
identify the derivatives, would be the most valid test of the accuracy of 
the parameter estimates. Since another set of flight data for the HL-10, 
under similar conditions, was not available, an experiment was run in which 
only the first 227 points of data were used to identify the parameter 
estimates and these results were used to predict the final 100 points (2 
seconds) of data. The identification algorithm which was used included 
the a priori weighting and the identification of the output biases. As 
the fits to the observed data, given in Figure 5.11, indicate, there is 
some divergence at the end, especially for r. However, the divergence 
in 8 and <}> was anticipated since the observed data suffers heavily from 
clipping during the final 1 second. The fit to the lateral acceleration 
was as good as might be expected considering the fit to the first 227 
points# 

5.3 M2/F3 Flight Test Data 

The data supplied to Systems Control, Inc. on flight No. 21, case 6 
of the M2/F3 lifting body is shown in Figure 5.12. The influence of 
wind gusts is evident in the time histories of the sideslip angle and the 
lateral acceleration. Referring to Section 5.2.1, the wind gusts were 
assumed to enter the dynamical equations of motion in exactly the same 
manner as the sideslip angle 8 . Nothing was known, a priori, about the 
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FIGURE 5.12 (CONT.) M2/F3: OBSERVED DATA AND CONTROL SEQUENCE TIME HISTORIES 





statistics (correlation time, mean square value) of the wind gusts. 

The only information supplied SCI was that the output error program used 
by FRC had failed to match the time histories adequately. A total of 
401 data points were supplied, representing 8.02 seconds of data. Once 
again, the effects of the instrumentation (quantization, clipping) were 
ignored as were the dynamics of the boom which measures sideslip angle 
(3 vane). It was also interesting to note that the flight conditions 
were appreciably different than for the HL-10. The angle of attack was 
only 1.57° and the Mach number was .468. 

Seven separate runs were made with the M2/F3 data, indicating a 
succession of possible model representations for the equations of motion. 
Since neither the measurement noise nor wind gust statistics were known 
a priori, these were included, where called for, in the list of parameters 

to be identified, along with the stability and control derivatives and 
initial conditions. 

5 * 3,1 Output Error - No Wind Gusts Included 

The maximum likelihood algorithm, in the output error mode, with the 
wind gusts assumed zero, was first used in trying to process the M2/F3 
flight data. It was intended that from such a run, it would become 
apparent where the wind gusts were having the most impact and also the 
results would serve as a standard against which the identification algorithm 
performance with the wind gusts included, could be measured. 

The time histories of the fit in each of the five measurements are 
given in Figures 5.13. As these figures indicate, the worst fits were 
obtained on the sideslip angle and lateral acceleration measurements, 
although none of the fits were as good as with the HL-10 data. These 
results also indicated that the model for including the wind gusts, suggested 
in Section 5.3.3 is appropriate, since the measurements involving the 
sideslip angle show the most random fluctuation when compared to the data 
from the HL-10 flight. 
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Fig. 5.13(cont'd) M2/F3: OBSERVATIONS AND ESTIMATES -OUTPUT ERROR 
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The parameter values obtained for this processing of the data are 
given in Table 5.4* 

5,3,2 Perfect Measurement of Sidesl ip Angle 

For this processing of the M2/F3 data, it was assumed that the measure- 
ment noise on the sideslip angle measurement is much smaller than the gust 
noise. With this assumption there is a perfect correlation between the process 
noise and the sideslip angle measurement noise, both being The state and 

sideslip angle measurement equations now appear as 

Cx “ F x + Gu + T 

y 3 = B + 3 n + n 3 £ 3 +3 n 

The Kalman filter for the complete four state, five output model must account 
for the perfect correlation. The most direct method for doing this is to 
first construct all equivalent four state model which is uncorrelated with the 
sideslip angle measurement. This is done by adding the quantity y^ 3 ^ n » 
which has value zero, to the dynamics, i.e., 

cx = Fx + Gu + re n + e (y 3 -e-P n ) 

nnd solving for 6 such that E { (rB n - 8„ T } - «. « " r is b “ 

the solution and the equivalent model has the resulting from 


cx = Fx + Gu + r(y 3 ~e) 

This equation is in the form of the Kalman filter, and it can further be shown 
that is the exact Kalman gain. Since is the third column of F, the 
dependence of x on 6 is eliminated and the equations of motion and the measure 

ment equations can be rewritten as 
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TABLE 5.4 M2/F3 PARAMETER ESTIMATES AND STANDARD DEVIATIONS 


Parameter 

Wind 

Max. Lik. 

Estimate 

Max. Llk Estimate 

Max. lik. 

Estimate 1 

Max. lik. Estimates 

llk. estimates 

Tunnel 
& Theo- 
retical 

-output error mode 
(with St'd dev.) 

Assuming perf. 8 Meas. 
(with St 'd dev. ) 

Directly ident. of 8 
(with St'd dev.) n 

With a priori weighting 
(with St'd dev.) 

with dependent paraas. 
fixed, fixed. 

(with St'd dev.) 

l p 

-<5.4 673 

-1.548 

(0.0935) 

0.679 

(0.033) 

-1.779 

(0.214) 

-0.461 

(0,0182) 

A 

L 

r 

0.8878 

2.008 

(1.187) 

10.49 

(0.547) 

25.46 

(1.908) 

4.154 

(0.140) 

A 

L s 

■75.140 

-54.49 

(2.45) 

-97.79 

(1.615) 

-135.38 

(2.238) 

-67.95 

(1.02) 

A 

S 

P 

.0802 ‘ 

.102 

(0.006) 

-.0203 

(0.00393 

-.142 

(0.0147) 

.00475 

(0.00349) 

A 

N r 

-.6876 

-.0307 

(0.078) 

-1.675 

(0.0590) 

1.628 

(0.199) 

-.764 

(0.0134) 

A 

n b 

Y 

P 

7.5342 

* 

2.876 

(0.136) 

7.324 

(0.152) 

-9.890 

(0.349) 

6.763 

( .0876) 

4.435 (.113) 

A 

Y 

r 

Y * 

a 

-.2001 

-.0476 

(0.125) 

-1.249 

(.0597) 

-1.466 

(0.0386) 

-.202 

(0.00392) 

A 

-1.36 (.0594) 

L s . 

14.04 

14.82 

(0.301) 

9.804 

(0.109) 

16.022 

(0.3017) 

10.96 

(0.161) 

9.66 (.169) 

L Jr 

10.03 

73.97 

(8.59) 

-109.28 

(5.519) 

-157.130 

(17.88) 

-42.18 

(3.13) 

A 

L 

0 

0 

11.14 

(1.828) 

-10.46 

(0.328) 

.145 

(2.11) 

-.572 

(0.115) 

-9.004 (.141) 


.83 

.596 

( .0223) 

.719 

(.0104) 

2.128 

(0.0456) 

.762 

(0.111) 

.756 (.0134) 

N ir 

-4.06 

-12.874 

(0.578) 

6.844 

(0.643) 

-11.754 

(1.467) 

-4.37 

(0.106) 

A 

N 

0 

0 

-.345 

(0.121) 

.177 

(0.0357) 

.427 

(0.198) 

-.233 

(0.0433) 

-.00239 (.0320) 

Y 4. 

0 

-.00033 

(0.0151) 

-.0363 

(0.00669) 

-.0125 

(0.0068? 

-.0847 

(0.00867) 

-.0275 (.00634) 

Y dr 

0 

.0301 

(0.363) 

-.874 

(0.222) 

-.926 

(0.227) 

-1.932 

(0.286) 

A 

Y 

0 

0 

.0179 

(0.354) 

.378 

(0.0299) 

.295 

(0.0313 

-.0974 

(0.378) 

.456 (.0189) 

*blas 




-.281 

(0.0531) I 

2.936 

(0.0629) 

-6.01 

(0.0933) 

-.667 (.108) 

^initial 


3.807 

(0.521) 

.359 

(0.188) | 

1.657 

(0.852) 

4.846 

(0.296) 

-3.125 (.239) 

'initial 


-2.262 

(0.0785) 

-1.66 

(0.0280) 1 

-1.604 

(0.0556) 

-2.22 

(0.0453) 

-1.061 (.0597) 

^initial 


-.556 

(.0251) 

* 

1 

- .565 

(.0279) 

A 


A 

’initial 


-34.44 

(-175) 

-32.69 

(*138) 

-33.630 

(.0796) 

| 

-31.91 

(*223) 

-31.52 (.224) 

a 






-44.147 

(.856) 




q 






6.231 

(.107) 




initial 






2.937 

(.256) ! 




Likelihood 

Function 

Value 


-1502 

-2237 

-2038 

-1122 

-1051 



TABLE 5.4 (CONT'D) 


Parameter 

! Max. Lik. with 

Rank Deficient Solution 

L 

-.531 

(.0189) 

F 

L 

4.268 

(.144) 

r 



L e 

*103.35 

(.105) 

N 

.0397 

(.00682) 

P 

N 

-.989 

(.0672) 

r 



N 6 

Y 

7.568 

(.306) 

* 


P 



Y 

* 


r 



Y s 

L «a 

-1.19 

(.0590) 

10.25 

(.0845) 

L 6r 

-5.539 

(.0257) 

L o 

-10.89 

(.280) 

N «a 

.561 

(.0254) 

V 

-.512 

(.651) 

N 

.587 

(.0833) 

o 


Y 6a 

-.0360 

(.00660) 

Y «r 

-.737 

(.219) 

V 

.408 

(.0296) 

0 


^bias 

-.164 

(.0428) 

P 

-1.029 

(.139) 

initial 


r initial 

-2.054 

(.0552) 

^initial 

-31.576 

(.0705) 

Likelihood 

-1689 


Function Value 
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where y^ is treated as a deterministic control. The order of the dynamical 
system has been reduced to 3 and the number of measurements to 4. 

Once a complete set of parameters has been obtained for this reduced 
order system, the time history of 3 n can be recovered. This is important 
since the identification of the statistics of the wind gusts is also 

A 

possible using identification. Sideslip angle estimate 3 can be found by 
substituting the parameter values of the three state model into the original 
four state model and solving for its time history. Then subtracting 
from the sideslip angle measurement gives the time history of 3 r + n^ (n^ was 
originally assumed small). A first-order linear model of the form 
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aB + v R 
n p 


(y 3 - h - e n + 


where v is the process noise with covariance q, can be fit to this 
data and the time constant a, the process noise covariance q, and the 
covariance of the measurement noise n^ identified. 


The time histories of the fit to the four observations (not in- 
cluding sideslip angle) are given in Figs, 5,14. The parameter estimates 
along with the estimates of the process noise covariance and the (recip- 
rocal of the) time constant for the wind gust are given in Table 5.3. 

The time history of the wind gust (including the neglible measurement 

noise) is shown in Fig. 5.15. 


The fit in each of the four measurements is very good, although time 
histories of the fit error indicate that there is still the same sinu- 
soidal variation, especially in p, that was observed in the HL-10 fit 
errors. Only the fit error in the lateral acceleration, a^, approached 
being white noise, which is the indication of the best possible fit. The 
value of the covariance of the noise on the sideslip angle measurement was 
almost two orders of magnitude smaller than the process noise covariance 
which supports the original assumption of this run. One surprising result. 
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Fig. 5.14 M2/F3 : OBSERVATIONS AND ESTIMATES - KALMAN FILTER WITH z 



rfU RATE TIME 








however, is that the L p , N g , and L g parameters changed sign from the 
wind tunnel value. r r 

There appeared to be two principle reasons for the ML and L 

r & 

parameters having the wrong signs. The first was that the magnitude * 
of the aileron variation was much larger than the rudder variation, 
unlike in the HL-10 case. Since the effect of the controls is additive 
in determining p and r, there is an identif lability problem with respect 
to N g ^ and L g ^ . This is substantiated by the small values of the terms 

of the sensitivity matrix corresponding to the N g and L g parameters. 

°r r 

The second factor contributing to the incorrect signs is the opera- 
tion of the yaw damper. This causes a feedback loop which activates the 
rudder as a result of yaw rate. The time histories of r and 6 appear 
in phase, therefore, in the M2/F3 time histories. In such a situation, 
unless the control 6 r i s modeled as a linear combination of the states, 
there is a uniquesness problem as to whether the actual aircraft dynamics 
on the feedback loop is being identified. With the yaw rate and S in 
phase, the N g ^ parameter, at least, will appear with a positive sign. 

problem of incorrect signs was of major concern and was the 
motivating factor for many of the remaining processing of the M2/F3 
data. The experience with the HL-10 data indicated that construing 
those parameters with wrong signs to have the same signs as the wind 
tunnel values would not correct the problem. The solution had to lie 
either in a more complete aircraft model or in dealing directly with 
the numerical problems causing the incorrect signs. 
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5.3.3 Wind Gusts Included: Direct Identification of Process Noise-Covariance 

and Time Constant of Correlated Gusts 

For the third processing of the M2/F3, the gusts were included dir- 
rectly in the dynamical and measurement equations requiring that a full 
Kalman filter be used in the maximum likelihood identification algorithm 
in order to obtain the sensitivities. For this processing of the data, 
the model of the wind gusts obtained from the previous run was used 

& n 85 + v 6 (5 ‘ 3) 

where a is the reciprocal of the time constant and is an unknown 

disturbance with covariance q. Replacing i** the original system 

equations by equation (5.3) results in 



Note that an additional bias term has been added to the <J> equation. 

With t he inclusion of the covariance of v^ and the 6 n state, there are 

now 24 parameters to identify: 17 stability and control derivatives <Y p and 

Y are assumed zero), 5 initial conditions, . q and a. 
r 

The difficulty involved in setting up the identification algorithm 
in such a case is that both the measurement and process noise covariances, 

R and q respectively, are unknown. However, both R and q are needed in 
establishing the Kalman filter gain, which is assumed to be in steady state. 
To begin the identification, therefore, some initial estimates of both q 
and R are necessary. R is assumed to be the diagonal elements of the sample 
covariance matrix obtained from the output error method. An initial value 
for q is obtained from the results of the previous run. 


102 



Once the initial iteration is completed, the value of q is updated 
like any other parameter and the measurement noise covariance, R, is ob- 
tained from the sample covariance. This last fact is derived from the 
property that with defined as the Kalman estimate of the state at 

time t^ given data up to time t^ f 

N 

lim 1 Y 1 , T «_ 

N-» « l 2-1 (y i _Hx i | i -l “ Du i) (y i ~ Hx i | i _ 1 -Du i ) T = HP ss H T + R 

i=l 

where P gg is the steady state error covariance matrix, obtained from 
solving a discrete Ricatti equation. The above expression is only 
approximate for finite data lengths. 


The time histories of the observations and the estimates are 

given in Fig. 5.16, and the final parameter values are given in 

Table 5.4. Although the fits to the p, 3, $ and a^ measurements 

obtained from this run improved over those obtained from the output 

error method, they are not totally acceptable. It is interesting to 

note that the signs of the parameters L and L have retained the 

P r 

same sign as the wind tunnel values. 

The time histories in Fig. 5.16 also indicate that most of the 
fits to the observed data are biased. Inclusion of measurement biases 
in the list of parameters to be identified did not, however, 
improve the performance. 
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5.16 (CONT'D): DIRECT IDENTIFICATION OF WIND GUST MODEL 
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The reason the maximum likelihood algorithm with the Kalman filter 
cannot reduce the fit error any further is basically numerical. What 
has occurred is that the diagonal element of the measurement noise 
covariance matrix, R, associated with the measurement of the sideslip 
angle has become very small when compared to the covariance of the wind 
gust disturbance. Indeed, the wind gust itself is practically white. 

As a result, the measurement noise cannot be distinguished from the wind 
gust. The alternative was to restructure the model so that the measurements 
of 3 + S n , the total sideslip angle are perfect. This is precisely what 
was done in the previous processing of the data. 


5 . 3.4 Three State Model With A Prio ri Weighting 

The results of the two previous processings of the H2/F3 data 
indicated that the assumption of perfect measurements of the sides P 
angle was reasonable and produced the best fits to the data owever, 
as stated earlier, the three state model resulted in wrong signs 
many of the parameters. The first processing of the data in an a P 
to correct these Incorrect signs used the priori weighting techmqu . 
The same weights as for the HL-10 data were used for the 142 F3, ®“ U 

ment biases were included in the list of parameters to be identifie 
since the use of a priori weighting on the HL-10 data indicated 

foxr bias estimation. 
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As shown in Fig. 5.17 the fits to the observed data resulting 
from a priori weighting were quite poor, especially in roll angle 
and lateral acceleration, although the parameter estimates themselves, 
as given in Table 5.4, were quite close to the wind tunnel values. It 

is interesting to note, however, that still has a wrong sign. 

r 

5 * 3 * 5 Three State Model With Fixed Parameters 

The basic causes of the incorrect signs for some of the parameters 
are that either the sensitivity of the output to changes in that parameter 
are small, as indicated by a relatively small diagonal element in the 
information matrix, or that there is a correlation, with respect to the 
sensitivity, between two or more of the parameters being identified. 

Such a situation would be indicated by an off-diagonal element of the 
normalized information matrix being close to +1. If this were the case 
the correlated parameters could not be individually identified. 

Both these problems existed with the M2/F3 data. 

One technique which has been used for treating both these problems 
is to fix one or more of a set of parameters that are correlated. 

The results of the identification run with fixed parameters indicated 
(i) the convergence of the algorithm to the final set of parameters 
estimates is more rapid and monotonic than when all the parameters 
are being identified, and (ii) , the final fit to the observed data 
is degraded to a certain degree. This latter characteristic is due 
to the fact that the number of degrees of freedom (equal to the number 
of parameters to be identified) for fitting the observed data has been 
reduced. In comparing the value of the likelihood function or cost for 
two cases with different numbers of parameters being identified (measure- 
ment noise covariance R being identified in both cases) the comparison 
should be made between a corrected cost, given by 

f(N,kHn/R/ 
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where N is the number of data points, and k is the number of parameters 
being identified and f(N,k) is a monotonically increasing function of k 
which depends upon the objective of identification. For certain types of 
systems, Akaike (Ref. 38) has shown that the one-step ahead final prediction 
error using the identified model is given by 


J " N-k In I R J 
N+k 


A plot of J vs k for a typical model is shown in Fig. 5.18. The important 
thing to note is that the predictive qualities of a model do not improve 
monotonically with the number of parameters, even though the fit error 
decreases monotonically with the number of parameters 


It was indicated earlier that there was very little variation in 
the rudder during the M2/F3 flight, which would make identification of 
the « r derivatives very difficult. In fact, the identified values of 
the L and N derivatives with the three state model were physically 
unreasonable, feing opposite in sign from the wind tunnel values. J 
first processing of the M2/F3 data in this set of runs was therefore 
made with the rudder derivatives fired at the wind tunnel values, with 
measurement noise biases being included in theunknown parameter set 
The results indicated a strong correlation between L and N and 
almost all the other parameters, and a fairly poor fit to the data. 

ring the same parameters but including a priori weighting did not 
improve the performance. 


The second processing of the data -fn t-two 

g oi. me data m this series included fixed 

d N dprivaf-fir^p. -I -I _ 

6 


L P and „ r derivatives as well as fired L { and K( . . The results 
3wed onlv a cl-ioVn- „ r 


' <5 * ~ • xne results 

s owed only a slight improvement over the 'previous' processing, and a 

strong correlation still eristed between L, and L pi Np and and 
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FIG. 5.18 PERFORMANCE CRITERION AS A FUNCTION OF THE NUMBERS 
OF MODEL PARAMETERS 
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N p and y 0’ Zt appeared, at this point, that a problem of identif iability 

was developing between the measurement biases and the initial conditions. 

The next processing of the data therefore included fixed L , N , 6 der-f- 

p r r 

vatives and initial conditions. The fits to the observed data were still 

very bad and correlation still existed between L and L„, L and L , and N 
x, r p r o’ p 


After several more experiments it was decided that L , L , L N 

P r* 6’ p’ 

N r and the S r derivatives should all be fixed and the initial conditions 
be identified instead of the measurement biases. The results, shown 
in Fig. 5.19, were the best fits to the observed data obtained with 
technique of fixing parameters at the wind tunnel values. The 
values of the parameters which were identified were all of the same 
sign as the wind tunnel values, as shown in Table 5.4. 


5,3,6 Three State Model With Rank Deficient Solution 

The results of the data processing, using the technique of fixing 
selected parameters of a correlated set at fixed values, showed that the 
convergence rate was improved due to better conditioning of the informa- 
tion matrix. The basic reasons why the parameter fixing technique does not 
always work are: (i) the correlation is usually not simply between 

pairs of parameters, but may involve the entire set of unknown parameters, 
and (ii) it is not usually possible to correctly choose a set of para- 
meters that should be fixed and the values at which they should be fixed. 

It was decided at this point to investigate, more fully, the problem of 
possible correlation between more than just pairs of unknown parameters. 

The solution of this type of dependency problem is to find the directions 
in parameter space corresponding to combinations of parameters which 
cannot be identified. A perfect dependency among the parameters would, 
strictly speaking, result in a zero eigenvalue of the information matrix, 
causing it to be singular. However, since round-off and other numerical 
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— observed data 

FIG. 5.19 M2/F3 TIME HISTORIES WITH DEPENDENT PARAMETERS AT + + + + estimate 

FIXED VALUES 




errors prevent the information matrix from being exactly singular, all the 
eigenvalues will be non-zero with a spread between the smallest and 
largest eigenvalue being many orders of magnitude. In such a case it is 
better to use a rank deficient solution for the inverse rather than a 
full rank solution. That is, the inverse to the information matrix should 
be computed leaving out one or more of the smallest eigenvalues. 


Ihe reason for neglecting the smallest eigenvalues can be explained 
in terms of the parameter step. The eigenvalues of the information matrix 
are the dimensions of the uncertainty e llipsoid , associated with the 
parameter estimates, in parameter space. The smaller eigenvalues indi- 
cating a larger dimension and therefore more uncertainty. Since the 
LxL information matrix M can be expressed in terms of its eigenvalues 
and eigenvectors, X A and V ± , i = 1, . . , L 

L 


M = 


L=1 


x i v i v i 


the parameter step is given by (see Section 4.3) 

L 

Ap = M" 1 (DJ) = X) X _1 V V T 

L-l 1 1 1 


where DJ is the gradient of the likelihood function. Therefore, the smaller 
eigenvalues also contribute the largest proportions to the parameter step. 

This implies that the largest components of the parameter step are in the direct- 
ion of the most parameter uncertainty. Therefore, the information matrix 


inverse is computed neglecting a certain number of the smaller eigenvalues, 


i.e. 



L-K 

e. 

i=l 




Each eigenvalue which is left out relates to a singular direction in 
parameter space, and, therefore indicates a combination of parameters which 
cannot be identified uniquely. Rather than fix the value of one or more of 
the parameters, as was necessary with the a priori weighting technique, the 
rank deficient solution fixes combinations of parameters corresponding to 
nearly zero eigenvalues. Thus the dimension of the space in which the set of 
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parameter values that minimizes the cost are sought, is reduced. It is important 
to realize, however, that at each iteration the values for all the specified un- 
known parameters are assigned updated estimates. 

The number of eignevalues to be neglected depends on which order 
rank deficient solution produces a parameter step resulting in a set of 
parameters with the lowest associated cost.. The procedures is as follows. 

Starting from some inimimum number of eigenvalues (10 in the HL-10 case), 
a reduced rank inverse is computed, the parameter step is determined and the 
associated cost evaluated. One more eignvalue is then added and the procedure 
repeated. This same thing is done until all the eigenvalues are added in, 
with the last inverse being a full rank inverse. The same procedure, starting 
from a minimum number of eigenvalues and progressing to the full rank, is 
repeated every iteration. 

The fits to the observed data are shown in Fig. 5.20* Comparing 
these with Fig. 5.14, it can be seen that the fits are only slightly 
degraded. The fits to roll angle and lateral acceleration are much 
better than those obtained for either a priori weighting or fixing 
of correlated parameters. The parameter values obtained for this third 
order rank deficient solution (3 eigenvalues neglected) are given in Table 
5.4. Although several of the parameter still have opposite signs from 
the wind tunnel values, many of them are much closer to the wind tunnel 
values and are more reasonable than the full- rank, 3 state parameter es- 
timates. Some, such as N g , now have the correct sign from physical 
considerations, where before they did not. It is clear that further 
development work on this rank-deficient solution approach will improve 
the estimates even more. 

It should be mentioned that the basic identifiability problem in the 
M2/F3 data is due to the stability augmentation system (SAS) providing 
all the sudden movement. The methods described above (viz. a priori 
weighting, fixing parameters and rank-deficient solutions) are indirect 
means of handling this problem. It would bemore exact to model the relevent 
characteristics of SAS in order to alleviate the problems . 
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FIGURE 5.20 M2/F3 TIME HISTORIES WITH RANK DEFICIENT SOLUTION 






VI 


BACKGROUND FOR LINEAR SYSTEM INPUT DESIGN 

The importance of choosing appropriate inputs (i.e., control surface 
deflections) for exciting specific modes of an aircraft or executing spe- 
cific maneuvers has been recognized for a long time* Several considerations 
which enter into the selection of inputs for an aircraft are: 

1. Pilot Acceptability - The inputs should be capable of being im- 
plemented easily by a pilot and the resulting response of the 
aircraft should not endanger pilot safety. 

2* Parameter Sensitivity - The measured response of the aircraft 

should be sensitive to the parameters that are being identified. 

This is necessary for obtaining good estimates of the parameters 
from the flight test data during the inverse computation or the 
identification process. 

3. Instrumentation Limitations - The dynamic range of the instruments 
and their signal-to-noise characteristics impose limitations on 
the types and magnitudes of aircraft maneuvers. The relationship 
between input design and instrumentation specification has been 
emphasized in (Ref. 40). 

4, Derivative Extraction Method - In the past, the choice of control 

inputs has often been dictated by the desire to use a particular 

method for derivative extraction. For example, sinusoidal inputs 

were used initially to obtain the transfer function of an aircraft 
at specified frequencies (Ref. 40). However, it was soon realized 
that this was very expensive in terms of the total flight test time 
required to obtain the aircraft stability and control derivatives. (Ref . 40) 
Next, the step and the doublet type of inputs were used and specialized 
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methods such as Prony's Method (Ref. 41) and the Time Vector Method (Ref. 1) 
were devised to extract derivatives. With the more powerful digital 
techniques available today such as the Newton-Raphson (Ref . 8) and the 
Maximum Likelihood Methods, (Ref. 33) arbitrary inputs can be handled 
and it is no longer necessary to limit the inputs for the success of 
the derivative extraction method. 


Modeling Assumptions - The six-degree-of-freedom equations of 
motion and the nonlinear aerodynamic model for an aircraft contain 
a large number of parameters (over 20C). The simultaneous esti- 
mation of all these parameters from a single maneuver is not 
attempted since this would lead to nonuniqueness and identifi- 
bilit y problems. Generally, linearized decoupled equations of 
motion are used for the extraction of longitudinal and lateral 
stability and control derivatives. The inputs selected for 
exciting these modes should be such that the assumptions of 
linearity and decoupling are not violated. The inputs currently 
in use are mostly of the doublet type. The resulting aircraft 
response is an impulse-type of response about a given trim 
condition. Generally no attempt is made to optimize the frequency, 
the shape, or the timing of the impulses in order to make the air- 
craft response sensitive to the parameters that are being identified. 

The motivation for the present study comes from a simulation 
of the X-22 VTOL Aircraft perfomed in 1970, and described in 

Section 5.1.4. T he multistep input gave parameter estimates which 
are an order of magnitude more accurate than the estimates obtained 
using the Cornell input. At about the same time, one of 
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the authors developed a general theory of optimal input design 
based on the techniques of modern control theory, (Ref. 42) In 
the following section, the salient features of the theory and 
the computation aspects of input design are presented. The 
results of applying it to the C— 8 aircraft parameter identification 
problem are considered in section VIII, 


6.1 Related Work on Input Design In System Iden tification 

The importance of input selection for system identification has also been 
recognized for a long time, though a unified mathematical treatment has emerged 
only recently. Some of the earlier attempts at input design were based on fre- 
quency domain methods and engineering judgment. A large amount of literature 
exists on Pseudo Random Binary Sequence (PRBS) inputs which have been found to 
provide improved identification for a large number of systems. (Ref. 43-45) However, 
PRBS inputs use very little information about the known properties of the system. 
Since in a number of physical systems some a priori information is available 
about the modes of the system (e.g., short period mode, phugoid mode, etc. of 
an aircraft's longitudinal motion), one can use this information to design inputs 
for identifying these modes more precisely. 

The work described in this report is most closely related to that of Aoki 
and Staley, (Ref. 46) Levadi, (Ref. 47) Nahi and Wallis, (Ref. 48) and Levin 
(Ref. 49) on input signal design for system identification and to that of McAulay 
(Ref. 50) and Esposito (Ref. 51) for signal synthesis. Aoki and Staley (Ref.. 46) 
consider single-input, single-output discrete-time systems. Levadi' s results 
(Ref. 47) are only applicable to the case in which the unknown parameters enter 
linearly in the system impulse response. Levine's results (Ref. 49) are applicable 
when linear regression is used to estimate the unknown parameters. 


The results presented here are applicable to multi-input, multi-output, 
continuous time systems. The computational algorithms proposed are new and 
and have not been used earlier for input design purposes. 
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VII 


THEORY OF INPUT DESIGN FOR LINEAR SYSTEM IDENTIFICATION* 

The problem of input design for linear system identification is formu- 
lated here as an optimal control problem. The performance criterion used 
is the sensitivity of the system response to the unknown parameters. The 
other criteria for input design such as pilot acceptability, instrumentation 
characteristics and state deviations described in Section VI are considered 
indirectly through an energy or power type of constraint on the input and 
through modifications of the final results. It is assumed that an output 
error or maximum likelihood method which can handle arbitrary inputs is used 
for derivative extraction. In these methods, the measured response of the 
system 2 (t) is expanded in terms of the system parameters as follows:. 

z(t) = y(e,t) + n(t) 

(7 • X, 

where y(t,6) is the true system response, 6 is an Nxl vector of unknown 
parameters and n(t) represents measurement noise. Let 0 be the best 
a priori guess of 0. By a Taylor series expansion of Equation (7.1), 

V 0 y(0 o’ (0 ~ 0 O ) + n(t) + higher order (7.2) 

terns 


z (t) = y(0 Q , t) + 

where V0 = ( — — -L_ 


in the output error method, the step (0-0 Q ) is determined by a least 
squares solution of Eq. (7.2) over the time interval [0,T]. It is intui- 
tively clear that the sensitivity function V 0 y(0 Q ,t) must be sufficiently 
large to achieve high accuracy in determining the parameter step (0-0 ) . 
Furthermore, the sensitivity functions must not be linearly dependent^ 
0 _< t _< T for a unique solution of (0-0 Q ). 


The theoretical aspects of this work were supported in part under AFOSR 
fir IT f ‘ F4462 °- 71 -C-0077. See "Dual Control and Identification Methods 
fication". C yStemS Part II » 0 P tlmal Input Design for Linear System Identi- 
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For input design, we now define a scalar performance index in terms 
of the sensitivity functions. For a single parameter, a suitable performance 
index is the weighted mean square value of the sensitivity function over 
the time interval [0,T]. 


J 


1 



dt 


(7.3) 


where is the variance of noise in the measurement y^. The perform 

ance index favors the measurements which are more accurate and makes 

them more sensitive to 0^ compared with the mesurements that are less 

accurate. It can also be shown that J 1 represents the inform- 
ation in the measured response about the parameter 0^ • 

For the multiparameter case, the choice of a scalar performance index 
is much more complicated due to the conflicting nature of the individual 
sensitivity functions. One possible choice is a weighted sum of the 
individual sensitivity measures. 


N 



j-1 


where the weighting coefficients w^ reflect the relative importance of 
parameters. It is clear that the criterion (7.4) can also 
be written as the weighted trace of the Fisher Information Matrix M. 

J = tr (WM) (7.5) 
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where W 


w. 


1 0 
W 2 


0 W N 




( 7 . 6 ) 


( 7 . 7 ) 


( 7 . 8 ) 


Other criteria such as the weighted trace of M the smallest 
eigenvalue of M or the determinant of M lead to nonlinear optimi- 
zation problems which are much more difficult to solve. 

An alternative method for choosing a criterion function is to con- 
sider the objective of identification. For example, if the identified 
parameters are to be used for control system design, the inputs should 
be selected so as to minimize the control error. Similarly, if the 
parameters are used for response prediction, the input design should be 
based on minimizing the mean square response error. However, these cri- 
teria generally lead to optimization problems which are mathematically 
intractable or extremely difficult to solve. It is also felt that the 
advantage gained by solving these difficult problems may be more than 
offset by the basic uncertainty about the initial parameter values 6 
The input design process described here should be viewed as a sequential 
design process in which the inputs are selected based on the current best 
knowledge of the parameters. 
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7.1 Problem Formulation 


Consider the linearized aircraft equations of motion 
x - Fx + Gu 


(7.9) 


where x is nxl state vector and u is a mxl control vector. F 
and G are n x n and n x m matrices of unknown parameters (stability an 
control derivatives) . The measurements are denoted by a p x 1 vector 
z(t) which is contaminated with white noise .n(t) 

z(t) = Hx(t) + n(t) . (7.10) 

H is a p x n matrix and n(t) is a zero mean Gaussian white noise process 

E(n> = 0, E{n(t)n T (T)> = R6(t - t) . < 7 - n > 


Let 0 denote the N x 1 vector of unknown parameters in the above 
equations. It is required to select the input (u(t), 0 _< t £ T} to 

maximize the weighted trace of the information matrix M subject to an 
energy type constraint. The information matrix M for the unknown parameter 

set 0 can be written as 


M 



(V Q x) T H T R _1 H(VgX) dt 


( 7 . 12 ) 


The energy constraint on u(t) is 



(7.13) 
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(7.14) 


The performance Index J can be written as, 

J = tr(WM} = t r {W 1/2 MW 1/2 } 

= tr {j (V Q x • W 1/2 ) T H T R _1 H(V 0 x • W 1/2 ) dt) 


In order to use this criterion, the quantity (V Q x) must be calculated 

as a function of time. A differential equation for (V 0 x) can be easily 

found from the system equation. Since the multiplication by W 1/2 represents 

a column operation, this is equivalent to calculating successive equations 
for 


1/2 9x 
*i 96. 


1,...,N 


This can be accomplished by the following: 


(« 


1/2 jbc_ 
30. 


> - <“" 2 $ r >- + < 2 tr > ♦ !?->■ 


with x = Fx + Gu. 


(7.15) 


An equivalent way of formulating the problem which makes use of the 
simultaneous computation of w^ 2 9x/90^ involves the specification of aug- 
mented F a , G a , and H a matrices and x A vector. These are defined as follows 
(with m the number of inputs and p the number of outputs) : 


F 

9F 

i 

J 
° 1 

w i 

90 

' F 1 '0 

1 I 1 


i. 

17 

* * 1 

\_ 1 I : 

W N 

90 

N 

I o | If 

li 1 J 


(N+l)nx (N-fl)n 


w 


9G 
1 90, 


w, 


9G 


N a0 N 
(N+l)nxm 


(7.16) 
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0 o 


Np x (N-fl)n 


0 „ 0 

— R. » — 

, » A 


Np x Np 


(7.17) 


X A W 2 36, 


C7.18) 


(N+l)n x 1 


With these 


definitions and using Eq. (92), it is possible to write: 


*A = Va + V 


(7.19) 


The performance criterion is now redefined as 


dt 

AAA A A 


( 7 . 20 ) 


In the next section, we obtain a set of necessary conditions for the 
optimal input using the Pontryagin’s Maximum Principle (Ref . 52). 
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7,2 O B tlm al Engergy-Cons tr a ined Input Using Maximum Principle 

Let A(t) denote the costate vector for * A ( t ) and y be the constant 

multiplier associated with the constraint (7.13). The Hamiltonian of the 
augmented system is. 


2 [_(x a )T H a T R a 1 H a (x a ) + U(u T u - |)] + A T [F a x a +G uj (7.21) 


the necessary conditions of optimality are 



or 


- fJa + hJ r; 1 H r 
A A A A A 


(7.22) 


and #e = 0 
u 


or 


u* 


1 

U 


(g a > T * 


(7.23) 


The boundary conditions are homogeneous. 


x A (0) = 0 , A(T) = 0 


Substituting for u* i„ Eq. 7.19, we obtuiu the two point boundary value 
problem. 


d_ 

dt 


X A 


A 



g a g a 


h I r A 




A 


(7.24) 
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Since the 


boundary conditions are homogeneous, the solution is trival 
E 0, X = 0, u = 0 except for certain values of y which are the 


viz. x a • , , 

eigenvalues of the two point boundary value problem. In other words the 

• 4 ii •- o (Ref 52) The eigenvalues and the optimal 
problem is of the Sturm-Liouville type. (Ret. 

ln put can be determined in a number of ways. Two possible methods are (i) 
the transition matrix method and (ii) the Riccati equation method. 

7.2.1 Transition Matrix Method 

Let ♦((.Oil.) denote the transition matrix of 7.24 for a particular W- 

1 „ T 


'Fa— - G. G. 
A y a a 


H a R a H A*' F a 


«I>(t,0;y) - exp I , 

\ H X 1 

Partition <t>(t,0;y) into X A and A parts as follows: 


(7.25) 


xx 


$ 


Ax 


xA 


AA 


(7.26) 


Then 


x.(t) 

A 


A(T) 


4> (T,0;y) 0 vl (T,0;y) 

XX 


4 , Ax (T, ° ;y) 


xA' 

^aa' 



X A(o) 


A(0) 


(7.27) 


quation in (7 . 27>along with the boundary conditions gives 


The second e 

A(T) - 4> xx (T,0:y) X(0) = 0 
For a nontrivial solution 
|$ xx (T f 0 ; y) | = 0 


(7.28) 


(7.29) 
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Eq.(/.29)i s the eigenvalue equation for the Hamiltonian system (7.24). It is 
a nonlinear algebraic equation in y and can be solved by a Newton-Raphson 
iteration. In general, there is an infinite set of eigenvalues, but we will 
be only interested in the largest value of y which will be shown to maximize 
J (Section 7.3). 

7.2.2 Riccati Equation Method 

The eigenvalues \i are functions of the interval length T. Therefore, 
one can fix y and determine T for which <J> AA (T,0;y) becomes singular. 
Another way is to use the Riccati matrix P(t) defined by the relationship 

* A (t) = P(t)A(t) (7.30) 


An equation for P(t) is obtained by differentiating both sides of Eq. (7.30) 
and substituting from Eq. (7.24). 

# 

• * 

x A = PA + PA 


or 


ip a p - 


n - ip + 


PR I 8 l\ p - PF > 


or 


T -1 


F,P + p F - PH . R . HP - — G G . 
A A AAA yAA 


(7.31) 


P(0) = 0. 

(7.32) 

The Riccati Eq. (7.31) differs from the usual Riccati— equation of the 
Linear-Quadratic problem in that the forcing term (last term) in Eq. (7.31) 
enters negatively. Eq. (7.30) can also be written as 
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X(t) = p" 1 (t) x A (t) 

whenever p" 1 exists. At final time t = T, since A(T) = 0, 

p”^(T) “ 0. (7.33) 


which means that a conjugate point exists at t = T. 

Eq. (7.33) provides us with a method to obtain the critical interval length 
T corresponding to an eigenvalue y. The Riccati Eq. (7.31) is integrated 
forward in time for a particular y using initial conditions (7 . 32) . When 
the elements of P(t) become very large, the critical length T correspond- 
ing to an eigenvalue is being reached. Now P'V) is integrated using the 

equation 


(p -1 ) = - p -1 p p 1 
dt v ' 


p -l (7.34) 


go to zero. 

U folio.. from the Sturmian propertyftef. 52) that the smallest T corresponds 
to the largest eigenvalue y. 

After the critical length T corresponding to the largest value of p 
has been determined, E,.0-24)is solved forvard in time using X(0) obtained 
from Eq, (2* 28) and (7. 29) as an eigenvector of t u (T,0;U> corresponding to 


or 


^ (p- 1 ) = - P -1 F A - f a p -1 + h a R^h a + £ P" 1 G a G / 


At: 


the critical interval length T, all the elements of P 


-1 
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the zero eigenvalue. Thereby, the boundary condition X(T) =0 is auto- 
matically satisfied. A unique value of X(0) is found by using the normal- 
ization condition of Eq. (7.13). 


7 ‘ 3 Application of Functional Analvc-fc 

✓ 

In the last section, the optimal input u was characterized in terms of 
the solution to a two point boundary value problem. In this section, we 
show that the optimal u is an eigenfunction of a positive self-adjoint 
operator corresponding to the largest eigenvalue p. 


Let A denote the operator corresponding to Eq. (7.19) viz. 

f* 

A[i 


w -/ 


e F i e - T) 


G. u(T)dx 
A 


(7.35) 


Let A*[ • ] denote the adjoint operator to A[*] 


A*[w] <= G 


/ 


T p T. 
e F A (s-t) 


w(s)ds 


Let <u,w> denote the inner product 
T . 




<U,W> *= / u (t)\*(t)dt 


(7.36) 


(7.37) 


The performance index J can be written 


as 


J '" < - X A "aWv 

■ <A “- h aY h a A u> 

- <u - 


(7.38) 


129 



The energy constraint of Eq. (7.13) is written as 


<u,u> ** E 


J is maximized subject to the above constraint by 


It is well known that 

U* which is an eigenfunction corresponding to the largest eigenvalue of the 
operator aV^V* Furthers, since A*hV\a is a positive self- 
adjoint operator, all its eigenvalues are real and positive Re 
T, the operator is also compact and has a finite maximum eigenvalue. The 
optimal u* is the eigenfunction of A*H^\a corresponding to this eigen- 


value and normalized according to <u,u> = E. 


a*hIra 1 «a a u 


yu 


(7.39) 


Also, 

Max J = y E 
u 


(7.40) 


To show the relationship of the above eigenvalue problem with the two 
point boundary value of Eq. (7.24), define 


(7.41) 

z *= Au 


and 


Then, 


n . r a t <-> 


ds 


using the definition of A and 


z « F z + G* u , z (0) 55 0 
A A 

n - - F^n + «XV’ n(T) 


* 

A , 


0 


(7.42) 


(7.43) 

(7.44) 
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From Eq. (7.39) and (7.42) 


G A T n c A * H XV U = y u 


(7.45) 


or 


1 

y 


x r A 

g a n 


(7.46) 


Therefore 


n 


r A z - y Va*" > z <°> ' 0 


" F A n + h a r ~a h a z * n(T) - o 


(7.47) 


A comparison of Eq. (7.46) and (7.47) with Eq. (7.24) shows that 


z » x. 


n a. 


7.4 Examples 

We now apply the above results to two first order and a second order 
example. 

7 -4.1 First Order System with Unknown Gain 
Consider the system 


x = -x + 0u 


(7.48) 


where x and u are scalars and 6 is the unknown gain. 


y = x + v , 


(7.49) 


where E{v) = 0, E{v(t)v(r)} = r <5(t - t). 
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From Eq. (7.23) and (7.24) 



(7.50) 

(7.51) 




V Q x(0) = 0=> CjW - C 2 = 0 

X(T) = 0 => C^[sin wT + w cos VT] - 0 
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For nontrivial solution, 


tan wT = -w 


(7.54) 


Eq.(7.54) is the eigenvalue equation. The smallest root w of Eq.(7.54) 

corresponds to the largest value of y. The optimal input u is°obtained from 
Eq. (7.50) as 

* C 1 

u = [sin W Q t + w Q cos w o t) (7>55) 

where p q = l/r(l + w^). Notice that u satisfies the same second order dif- 

ferentia! equation as X viz. Eq.(7.52). C is determined from the condition 

/ u dt = E. 
o 


This gives 


'i = \fi 


T/2 - 1/4 sin 2T 


(7 . 56) 


7-4*2 Levadi*s Example 

Levadi considers the following example in his paper (Ref. 47). 


x = -x + bu 


y = x + v 


where v is a correlated noise process with autocorrelation function. 
E{v(t)v(x)} = c e -a l t-T l 


(7.57) 

(7.58) 
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It is required to estimate b only. 


Levadi's (Ref. 47) results can be easily derived as follows: 
v(t) can be represented as 

, 0 (7.59 

v *= — av + e * 

yhere e is a white noise process and 
E{e} = 0 , E{e(t)e(x)} = 2ac 6(t - x) 

A new measurement can be generated by differentiating Eq.(7.58). (This 
procedure is similar to that of Bryson and Johansen, Ref. 53). 

y = x + v 

= (a - l)x + bu - ay + e 


Now y can be regarded as a new measurement which has white noise e in 
it. The new information matrix is: 

T 

M =■ I -TT- t(a - l)V,x + u] 2 dt (7.60) 

| 2ac d 

3 o 

where 


V. x = -V. x + u (/.b. 

b b 

Now the problem is in the same form as example 7 .4.1 except that the 
performance index is slightly different. 
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The following equations of optimality are easily derived. 


u «= - 


2 <* - 2 ^> 


[X - V X] 

ac b J 


J . - 0 . . , + u + , 

ac b ac 


An equation only in terms of u can also be obtained from 


Eq. 


5 - 11 - - - o 


The eigenvalue equation is 


(7.62) 


(7.63) 

(7.61) - (7.63) 

(7.64) 


tan(wT + 0) = — 
a 

2 

where w = [-1 + ~ — — 

2 ac - 1 J 

0 = tan ^ w 


(7.65) 

(7.66) 


The optimal input 


* 

u = A sin(wt + 0) 


(7.67) 


Notice that the results for example 7 .4.1 can be obtained by letting 

a and 2c/,a ■” r » wh ere 2c/a represents the area under the autocorrelation 
function of v. 

The optimum value of w is chosen to maximize p. From Eq. (7.66), 


P 


1 

2ac 


[1 + 


- 1, 


1 + 


w 


(7.68) 


It is seen from Eq .(7. 68) that when a 2 > 1, the maximum of p is attained 
for the smallest value of w. This corresponds to the case when the noise is 
wide-band. For the narrow band noise case viz. a 2 < 1, the second term in Eq. 
(7.68) i s negative and the maximum of p viz. l/2ac is reached at w = ». The 
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practical implication of this result is that the input should be of as high 
frequency as possible. Since the noise is narrow band, this increases the 
high frequency signal to noise ratio at the output. 


7.4.3 Second Order Example 

The following system represents the short-period longitudinal dynamics 
of an aircraft. 



(7.69) 


x represents pitch rate, x 2 is a linear combination of pitch rate and angle 

of-attack and 6 is the elevator deflection. The transfer function of the 
e 

system from elevator to pitch rate is 

X x ( s ) b i s + b 2 (7.70) 


Optimal Input for Identifying b^. 


Sensitivity equations are 




" a i r 


" 7 b^ 

+ 

”l 



_ a 2 °_ 


V b^2 


_0_ 


Optimal input, 6 g = - X^/2p 


h • X \ *1 ' Vl - Vz 


X 2 - -Aj 


(7.71) 


(7.72) 

(7.73) 
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Both the optimal input ^ and A,, satisfy fourth order linear differential 
equations of the form 

ID + " 2a 2 - a 2 )D 2 + a 2 ]A 2 = 0 (7.7^ 

where D denotes the differential operator d/dt. 


A solution for A 2 is easily written as 

X 2 (t) = C 1 sin w i t + c 2 cos V + C 3 sin w 2 t + C 4 cos 


where 


W X = 72 ~ ^ - 4a 

w 2 = Tf + ^ ~ 4a ' 


( Jr " 2a 2 “ a \> = + w 2 


Also w 3 w 2 = a 2> 


(7.75) 


(7.76) 


It is assumed here that o > 2a 2 or 1/pr > (4^ + a 2 ) since other cases 

ad to hyperbolic functions which become unbounded for large T. They are 

rejected as possible solutions using the same reasoning as used in examples 
/ .4.1 and 7.4.2. 


The expressions for A, (t) , Vu x 

1 N / * bjL ^ 

from Eq .(7. 75) using Eq.(7. 71)- (7. 73) 
is obtained as 


u(t) and V b2 x 2 are easily obtained 
The eigenvalue equation, assuming 4 ^ 
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sin WjT 
-w^ cos w^T 


" W l a l 


w. 


3 "1 

W 1 " W l a 2 " IS? 


cos WjT 

sin w^T 

. 2 
a 2 + W l 
2 

W l a l 


sin w^T 

cos w^T 

“ a l W 2 

u 

3 _2 

w 2 " W 2 a 2 " yr 


cos w^T 

sin w 2 T 

* . 2 
a 2 + w 2 

2 

W 2 a l 


0 (7.77) 


where w - - aj/Wj.. Equation(7.76)is used along with- (7.77) to select w x and w 2 
which maximize p. From Eq. (7.76) 

- <»1 + ^ + n • <»1 ' + ’I ' 

The minimum o£ 1/yr is attained at w* - -a 2 or “ x - “j* 1-e ” at the ' mda " ped 
natural frequency of the system. However since w 2 - “ 2 is ruled out by 
solution considered here, the root of Eq.(7.77) closest to the undamped natural 
frequency should be chosen. Since w^ - - 2 , the two frequencies Will bracket 

the natural frequency of the system. 


7 S 

The optimal input is 


6* = — ^ c i w i cos w l t ” C 2 W 1 Sin W l t + C 3 W 2 C ° S W 2 t C ^ W 2 Sln ‘ 7 * 78 ^ 

C . C C and c 4 are determined as the eigenvector of Eq. (7; 77) corresponding 
to the root u^. They are normalized using the condition 


£ 


6 dt = E 
e 


7,5 Extension to Systems wit h Process Noise 

Consider the linearized aircraft equations of motion 


x » Fx + Gu + Tn 
/ 

Z = Hx + v 


(7.79) 
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where n(t) is a gaussian white noise forcing function 
and modeling errors. 


representing random gusts 


E<n(t)} - 0 , E{n(t)n T (t)> - Q6(t - T ) 


The information matrix H in this case is given in terms of the Kalman filter 
for the above system. 


£ - Fx + Gu + K(y - H£) 
K = ZH T R _1 


(7.80) 

(7.81) 


FE + ZF T + TQr - ZH T R _1 HZ 


(7.82) 


where * denotes the best filtered estimate of x and I denotes the covariance 

of S. The Kalman filter provides a linear causally-invertible whitening 

transformation for the process y since the innovation sequence (y - H4) is a 

gaussian white noise sequence. The likelihood function is easily written 

in terms of the Innovation sequence (Refs. 24. 27). The information matrix M 
is given as 


f T 

M = j e <<V) T H T R -1 H(V e *)}dt 


(7.83) 


where V 0 * denotes the sensitivity function of the filtered estimate x with 
respect to the unknown parameter vector 0. Note that both K and Z are func- 
that the sensitivity equations are much more complicated than 
for the no process noise case. Moreover M, in general, depends on the random 
quantities „ and v so that its expected value needs to be maximized. 


A special case arises when 6 contains parameters from G only and the 
initial state is known exactly. Since K and P do not depend upon G, the 
sensitivity equation has a simple form 


V “ (F " KH > V + V 0 Gu 


(7.84) 
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K is, in general, time-varying, but if the system is completely controllable 

and observable, K reaches a constant steady-state value (Ref. 54). Then E, . ( 

, .. r r n Ffl (7.19) and most of the theory developed in Sections 
is essentially similar to Eq. v > 

7.2 and 7.3 carries over to this case. 


7 t 5 .]_ Example : Let us 


consider example 7.4.1 with additive process noise. 


x = -x + 0u + n 


= X + V 


(7.85) 


where 


E{r)} = o , E{n(t)n(x)) = q«(t - x) , x(0) - 0 

for 0 under steady-state filter gain, k > 0 is 

( 7 . 86 ) 


The filter sensitivity equation 

v * - -U + k)v a + u , v k(0) = o 
0 y 

Proceeding as in example 7.4.1, and defining 

„ . [i- - a + w 2 ) 1/2 


pr 


or P = p ,2, 

rta) + (1 + k) ] 


(7.87) 


it is seen that the optimal input u* obeys Eq. (7. 54)- (7. 56) . Notice that by 
increasing process noise q, the gain increases and p decreases. Thus the 
information M = pE for the same input energy E decreases. The frequencies 
u, however, remains unchanged. 
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7*6 State-Variable Constraints 


Linear state variable constraints can be handled either directly by 
adding a quadratic penalty function to the performance index or indirectly 
by adjusting the total input energy E. The examples 7.4.1-7.4.3 show that 
the amplitude of the input u is determined by E. Thus by adjusting E, the 
amplitude of the input u and the state x can be bounded. Of course, the 
inputs obtained in this fashion are not strictly optimal. 

7* 7 __Steps in Optimal Input Program 

As outlined in Sections 7.2 and 7.3, there are several possible com- 
putational techniques which can be used to solve for a consistent pair of 
interval length T and largest eigenvalue p^. The method that has been 
implemented in this program is to numerically find the first time instant at 
which the solution of the ricatti equation, relating x A (t) and X(t), becomes 
infinite. The instant at which this singularity occurs can be found with 
arbitrary accuracy by continually using a finer and finer step size and noting 
the instant at which the diagonal elements of the Riccati solution change sign.* 
This change in sign of the diagonal elements is one key indication that the 
ricatti solution has blown up through one direction (e.g., -“) and returned 
from the other (e.g. +«) # 

The complete flowchart of the computer program designed to calculate 
optimal inputs is given in Fig. 7.1. Many of the detailed steps have been 
combined into a single descriptive step since their description is beyond 
the scope of this report. For example, the actual technique used to inte- 
grate the ricatti equation is explained elsewhere (Ref, 55). 


The only instant in the computational algorithm where the theoretical 
development was not followed exactly was in the calculation of the eigen- 


The alternative technique of calculating the largest eigenvalue of the 
Hamiltonian system is much more difficult since the determinant of the 
transition matrix for the Hamiltonian system exhibits a very sharp zero 
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iterate until t « T 

MAX 

Fig. 7.1 FLOW-CHART OF OPTIMAL INPUT COMPUTER PROGRAM 
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vector of * (T,0;u) associated with the zero eigenvalue. The problem en- 

countered was basically numerical. Very seldom, if at all, would the matrix 
<f. xx (T,0;y) exhibit an exactly zero eigenvalue. This was caused, to a large 

extent, by the fact that T is never the exact instant of singularity of the 
ricatti solution. This difficulty was resolved by choosing A(0) such that 
the product <{> xx (T,0;p) X(0) had the smallest norm. If <J> xx (T,0;p) did, in- 
deed have a zero eigenvalue, this would be the associated eigenvalue and 
<|> x (0) = 0. The solution of this minimum norm problem is to choose A(0) 
to be the (normalized) eigenvector associated with the smallest eigenvalue 

of <(> T <j> . This was the technique incorporated into the optimal input 
XX XX 
program. 

Two additional items should be mentioned concerning the steps in the 
computational algorithm. The first involves specifying some additional fact 
about the eigenvector A(0) so that it can be uniquely specified. It was 
shown in Section 7.2 that the control at any time point, t, is a linear func- 
tional of A(t). Therefore, if a particular input energy is required, the 
norm of \(0) is scaled to the proper amount. 

The second item concerns the reconstruction of the M matrix as the last 
step in the computer algorithm. To just calculate the performance index, 
tr{WM}, it would be sufficient to use 

f T 

«<l lt I H I E A H A X A dt) ‘ 

J o 

However, the information matrix M, the Cramer-Rao lower bound M" 1 , and the 

det(M) give important information about the identification of the parameters 

which is not reflected in tr{WM}. For this reason, the augmented state 

1/2 

vector, x A , is rearranged to construct the matrix (V 0 x)W , which is used 
to compute the information matrix M. In addition, the eigenvalues of M and 
a f igure-of-merit tr{M} • tr{M } is also computed. 
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7. 8 Specialized Algorithms 


This section describes the two specialized algorithms which have been 
incorporated into the main program. These are: (i) the algorithm for com- 

puting the optimal set of weights when using the product of diagonal elements 

of M criterion, and (ii) the algorithm for computing p when the data 

max 

length, T, is specified* 

An Algorithm for Choosing Weights : 

As outlined in Appendix B, one purpose of the weighted trace is 
to obtain a closer approximation to the determinant of M as the performance 
criterion. This is done by bringing the ratio of the largest to the smallest 
diagonal element of M as close to 1 as possible. The computer algorithm 
takes the form of an iterative sequence of choosing weights, calculating the 
resulting M and then updating the weights and repeating. 


The formula for the updating of the weights, once an optimal input has 
been computed along with the accompanying information matrix, is as follows: 


w 

new^ 


w + enf- 
old. m 

l 


- w ] 
ii old i 


(7.88) 


* th 

where is the i diagonal element of H. Additional logic was subsequently 
added to the program to reduce the factor a by successive factors of 2 if the 
new set of weights failed to reduce the ratio of the largest to smallest 
diagonal element of M. This was made necessary since the equation (7.88) for 
updating the weights is by no means optimal* 


An Algorithm for Determining jj for a Given T- 
. max ' 

The second special algorithm built into the program enables a user to 

determine the U max for a specific length of data, T. The most direct, but 

costly, way of determining a value of u would be to pick several values 

max 

^max anc * rUn through the program, finding the associated values of T* 

These pairs (p f T) could then be used to construct a u 

max max 
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from this curve, the p for a desired T could be determined. However, this 

ID3X 

procedure would require a great many (h max » T ) pairs. A more direct method 
is to employ the optimal input program with associated zero crossing logic 
which would converge onto the correct W max - For the simple case of (p^,T^) 
and (p 2 ,T 2 ) as known associated pairs and where is the desired 

data length, the iterative equation for successive choices of 
P d is (y d = P 2 ) ( Ref - 56 )* 


< T i - T d> i 
- T x p. 


< T 2 - V 

T i - T 1 


(7.89) 


With a new, updated choice of p^, the program is run and an associated 
is found. This new pair is used in Eq. (7.89) to find an updated p, , and 
so on. Once the change in values for p d becomes smaller than some C, the 
procedure is stopped. For e of 1%, this procedure usually requires only 
four or five repetitions. Of course, if the value of T d is outside the 
range of the given initial pair (p^) and (p^), the logic of Eq. (7.89) 
is altered appropriately. 


Examples of both these specialized algorithms are given in Section VIII 
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VIII 


APPLICATION OF OPTIMAL INPUT DESIGN TO C-8 AIRCRAFT 

This section discusses the results of applying the optimal input 
design technique described in Section VII to a two and four state model 
of the longitudinal dynamics of a C-8 aircraft. In both cases the input 
was designed to optimize the identification of the five parameters associated 
with the short period mode. Additional constraints were put on the input 
in the form of limiting the maximum excursion of the angle of attack and 
pitch rate by adjusting the total energy of the optimal input. In the first 
part of this section the results of the two-state example are described, 
including a frequency domain analysis of the optimal input, a comparison 
of the input performance versus a subop timal doublet elevator input and 
use of the weighted trace criterion. The second part of this section describes 
the results of a Monte Carlo simulation of the identification process for 
the five short period parameters from simulated flight test data for both 
optimal input and a suboptimal doublet input. 

8*1 Short Period (Two-State) Longitudinal Dynamics of C-8 Aircraft 

8.1.1 Optimal Input for two-state model 

This investigation involved finding the optimal elevator deflections 
$ e to identify the parameters in the two dimensional short period longitudinal 
equations of motion for a C-8 aircraft. The state variables are the pitch 
rate q and angle-of-attack. a. The equations for the short period dynamics of 
the C-8 aircraft were 

■ i - 66 ] . 

0.005J e 

and the measurement equations were 


q 

sz 

-1.588 

-.562 


q 

+ 



_ 1 

- 0. 737_ 
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In determining the power spectral densities of n^ and n^, the values gi 
Ref. 13 (1 deg. /sec. error in q, and 2 in a) were multiplied by two 

times the correlation time of the noise sources, which is assumed to be 0.01 
secs. The measurement noise spectral density matrix is therefore given as 


0.02 


R 


0 


0 

0.04 


For this investigation the data length T was fixed at 4 secs. The 

appropr is te y was found from ay “T curve shown in fig* 8*1* 

rr r max max 

The value of y associated with a T of 4 secs is 0.015. The shape 
max 

of the y -T curve in Fig. 8*1 is characteristic of the general 
max 

relationship between these two variables. 


For the y and R values indicated above the optimal input 
max 

with the respect to the three parameters in the F matrix and the 
two parameters in G is given in Fig. 8*2. The energy of the input 
was 311 and tr{M} = 20,460. The check value of y max E was approxi- 
mately 20,200, indicating a numerical error of 0.1%. The determinant 
of M was computed to be 1 x 10 , with the ratio of the largest to 

smallest eigenvalue of M being almost three orders of magnitude. The 
eigenvalues themselves indicated that much of the relative uncertainty 
in the parameter estimates was concentrated in two of the five dimensions. 
The standard deviations of the parameter estimates were 


Standard 

Deviation 


0.167 0.0639 

Standard for Q = 

0.095 

0.03 5 _ 

• Deviation 

_0.025_ 


The time histories of the states a and q, resulting from the optimal 
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OPTIMAL INPUT 

Tr{M} = 20460 E “ 311 


OPTIMAL INPUT FOR SHORT PERIOD LONGITUDINAL DYNAMICS 

Fig. 3.2 - 
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input are shown in Fig. 8.3. Jhe energy of the input was constrained 

so that a does not exceed 10°. This method of energy limitation is 

the most direct way of applying state constraints; although, as mentioned 
in Section VII, the penalty function approach can also be used. 

g^2 Fourier Transform of the Op timal Input 

Since it was specified that the input be designed with respect to the 
parameters both in F and G, it would be reasonable to infer that the input • 
would have a low frequency component for identifying the parameters in G and 

a higher frequency component for identifying the parameters in F. The actual 
Fourier transform of the input signal is given in Fig. 8.4. The vertical 

scale has been reduced to 1/10 its actual height in order to illustrate 
the smaller variations. The DC component is 0.98. The small peak in the 
transform occurs at a frequency of 0.375 cyc/sec, which is close to the 
short period frequency of the aircraft. 

The most important point demonstrated by the frequency domain analysis 
is that the sinusoidal like component of the input signal occurred at a very 
specific frequency. It is well known that the maximum signal power can be 
obtained from a second order system if it is excited at its natural frequency. 
Therefore, in order to maximize the sensitivity of the output signal to the 
parameters in the F matrix (which is given by the M matrix) , or in other 
words, to maximize the component of the output signal due to the parameters 
in F, the input signal had a specific component set at the natural system 

frequency . 


8.1.3 rnm parison wit h a Doublet Input of Equal Duration and Ener gy, 

The third part of the investigation was to compare the performance 
of the optimal input to that of a doublet input of equal duration and 
energy. The doublet input used for this comparison is shown in Fig. 8.5 
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FOURIER TRANSFORM OF OPTIMAL INPUT 
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FIG. 8.5 SUBOPTIMAL DOUBLET INPUT 


The determinant of M was 7 x 10®, seven orders of magnitude less than 
for the optimal, while the trace of M was only 1600. The standard deviations 
of the parameter estimates were 


Standard 

Deviation 


for F 


.147 


.247 
.164 5 


Standard 

Deviation 


for G = 


.0631 

.0346 


Two of these values are smaller than those obtained for the optimal input, 
however the standard deviations in the parameter estimates are not explicitly 
in the optimization criterion. 


g.1.4 Effect of Small Parameter Value Changes on Optimal Input 

Since, in an operational application, the actual parameter values are 
not known, it was important to investigate the effect that changes in the 
parameter values had on the optimal input shape. These changes might reflect, 
for example, the difference between wind tunnel or theoretical estimates of 
an aircraft's stability and control derivative and the actual stability and 
control derivative values. It would be these estimates, however, that would 
be used to compute the optimal input for the identification of those same 
derivatives. The modified F matrix which was used in this investigation was 


154 



ELEVATOR 

DEFLECTION 

(DEG) 


.4 .8 1.2 1.6 2.0 2.4 2.8 . 3.2 



Fig. 8.6 - OPTIMAL INPUT FOR SYSTEM WITH 10Z PARAMETER VARIATION 
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-1,429 


-.613 


F ® 


1 -.663 


The optimal input resulting from this approximately 10% change in the 
parameter values is shown in Fig. 8.6. The most obvious difference is that 
the input length is now 3.21 seconds instead of 4.0 secs. Otherwise, the 

input has the same qualitative shape as the optimal input. The expected stan- 
dard deviations in the parameter estimates resulting from this input when 
it is applied to the original system are as follows: 


Standard 
Deviation 

Three of these values are very close to the standard deviations obtained for 
the optimal input, while the other two represent increases of approximately 
50%. The improvement over the subop timal input, however, still exists. 


.255 .0886 


.148 

.036 

Standard r „ 

; n . . . for G = 

Deviation 

.026 


8. 1 . 5 Weighted Trace Criterion : 

This part of the investigation involved using the weighted 

trace criterion to derive the optimal input and choosing the weights 

to make the diagonal elements of the information matrix equal. As 

detailed in Section VII the performance criterion is tr|WM[ where W 

is a diagonal matrix of weights chosen to set oj-m., = . . . = co m . 

1 11 p PP 

When all the diagonal elements of WM are equal, maximizing the trace 
of WM is equivalent to maximizing the product of the diagonal elements 
which is a better approximation to det(M) . 

Since the input which maximizes the performance criterion depends 
on the values of the weights, which in turn affect the input, an 
iterative scheme was used to update the weights until convergence was 

achieved. 
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Fig. 8.8 OPTIMAL STATE TIME HISTORIES FOR UNITY WEIGHTS 
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The optimal input and the state time histories for a T of 0.77 
sec and an energy of 62.61, and unity weights are given in Figs. 8.7 
and 8.8. The trace of M for this input was 611 and the determinant 
of M was 8.11 x 10 3 . The ratio of the largest to smallest eigenvalue 
of M was 1270. The eigenvalues themselves indicated that the parameter 
uncertainty was quite disproportionate along two of the eigenvector 
directions. The standard deviations in the parameter estimates are 
given below. 


Standard 

Deviation 


for F = 

2.03 

5.57 


Standard - 

for G = 

0.381 


— . 

1.43_ 


Deviation 

_0.129_ 


Notice the increase in standard deviations due to a shorter data length 
(0.77 sec. vs 4 sec.) and smaller input energy (62.6 vs 311). 

Using the weighted trace criterion, and 12 iterations to bring the 
ratio of the largest to the smallest element of WM down to 1.14, the 
optimal input and state time histories given in Fig. 8.9 were obtained. 
The determinant of M was calculated to be 2.27 x 10^ which is five 
times greater than the unity weights determined. The volume of the 
uncertainty ellipsoid decreased by the same factor. Another indication 
of this was the fact that the ratio of the largest to smallest 
eigenvalue of M was reduced by a factor of 2 to 640, with the largest 
eigenvalue itself being reduced by a factor of 2. The standard 
deviations for the parameter estimates are as follows: 


Standard 

Deviation 


1.56 4.04 

Standard 

0.304 

1 . 23_ 

’ Deviation for G = 

_0.122_ 


Notice on Fig. 8.8 that the parameters which are poorly estimated with unity 
weights were assigned higher weights. As a result, the lengths of 
the error ellipsoid along each of the axes has become more uniform and 
the total volume of the uncertainty ellipsoid has been decreased. 


159 




Fig. 8.9 OPTIMAL INPUT AND STATE TIME HISTORIES 


- WITH L'EK.IITED TRACE 



It is clear from this example that using a weighted trace criterion 
does result in an input which can, in an overall sense, identify the 
unknown parameters with improved accuracy. This improvement is measured 
by the increase in the value of the determinant of M. 

8 -2 C-8 Monte Carlo Simulation 

A more realistic test for verifying the advantages of the optimal 
input over a sub-optimal doublet input of equal energy was to perform a 
monte carlo simulation of the identification process using simulated flight 
test data. To make the simulation representative of an actual flight test 
the following were included. First, both phugoid and the short 
period modes in the longitudinal equations of motion were used 

to generate the data, although only the short period derivatives 
were to be identified. The optimal input had the characterlstlc 

of suppressing, as much as possible, the phugoid mode in order to maximize 
the sensitivity of the output to the values of the short period parameters. 
Second, short period parameters of the four state model that were used to 
compute the optimal input were changed by approximately 50% from the equi- 
valent parameter in the model that generated the data. In this way the 
situation of designing the input based on the wind tunnel values of the 
stability and control derivative was simulated. 

There were several criteria for comparing the performance of the 
optimal input with that of the suboptimal doublet input. Since the 
optimal input was computed on the basis of maximizing the trace of the 
expected information matrix, this is an obvious candidate. Maximizing 
the determinant of the information matrix, or equivalently minimizing the 
volume of the uncertainty ellipsoid, in parameter space, is another. How- 
ever, by the nature of a monte carlo simulation, the parameter estimates 
themselves can be used to calculate the sample covariance, and its trace, 
determinant or the eigenvalues can be used as performance criteria. 

Finally, histograms of the parameter estimates themselves and the 
associated probability distributions can also be used in determining 
input performance. 
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8.2.1 Optimal and Suboptimal Inputs for Monte Carlo S imulation 
The four state longitudinal equations of motion for the C-8 
aircraft which were used for computing the optimal input are 
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.125 

.125 
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ij 


.250 


is the discrete measurement noise covariance. The optimal input 
itself was designed with respect to enhancing the ability to identify 
the five short period parameters . given in the locations marked 
by an x 
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a nd Zg . Considering the 


These parameters are M^, M^, Z , M g 

amount of computer time needed“to ginerate°fhe optimal input for 
five parameters and four state variables, a 2 second data length 
was decided upon. The optimal input, for an energy of 58.0 is 


shown in Fig. 3 . 10 , along with the suboptimal doublet input, of 
equal energy, which was used for comparison. 


8 - 2.2 Generation of Simulated Flight Data 

The different stability and control derivatives which were used 

in the generation of the simulated flight data are illustrated 
below. 


Same J 

Same 

■ 


Same 

1 

Same I 

-2.238 

-.28 

G = 

-.829 

1 

1 

1. 

-.368 


-.0075 


random noise added to each of the four measurements were derived from 
a Gaussian random number generator. These noise sequences 
presented a slight problem since, with a 2 second data length and 

a .02 sampling period, the 100 samples were sometimes insufficient 
for the noise statistics to have the desired mean 

and covariance. However, since the performance of the optimal 
and suboptimal inputs were always compared for the same measurement 
noise sequence, the problem of incorrect noise statistics should be 
of minor importance for comparison purposes. 
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Fig. 8.10 - OPTIMAL AND SUBOPTIMAL INPUT FOR MONTE CARLO SIMULATION 
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Fig. 8.11 BLOCK DIAGRAM OF MONTE CARLO SIMULATION 
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It is interesting to note that since the weighting matrix is the 
sample error covariance, the value of the likelihood function will be 
determined completely by the log of the determinant of the sample error 
covariance. There are also two different methods for choosing the 
weighting matrix. One is that it should be considered a diagonal 
matrix since the measurement noise sources are all independent. The 
other is that, non zero off diagonal terms, be allowed in the weighting 
matrix. For this Monte Carlo simulation, the former method gave better 

results. In the latter case, the error covariance matrix had a tendency 
to become singular at times. 

For this application, none of the options discussed in Section 
4.6 , including parameter fixing, rank-deficient solution for M -1 or 
bounding the parameter estimates were used. Neither was it necessary 

during any of the iterations to cut the step size in order to improve 
the convergence properties. 


8,2,4 .Results o f Monte Carlo Simulation 

The monte carlo simulation consisted of 50 runs of the 
Identification procedure, with the parameter estimates, information 
matr« and its eigenvalues, and the parameter covariance matrix 
and its eigenvalues being saved at the end of each run. The ensemble 
results are tabulated in Table 8.1. 

The theoretical values of the trace of the information matrix, 
using the actual values for the stability and control derivatives 
were computed to he 2.12 x 10 7 and 4.74 x 10 5 for the optimal and’ 
suboptimal inputs respectively The average values, from Table 8.1, 
for the 50 runs were 2.15 x 10 7 and 4.79 x 10 5 , indicating that 
50 runs were sufficient for obtaining accurate parameter estimate 
and information matrix statistics. In addition, the trace of the 
sample covariance, computed from 

50 

Sample Covariance - £ (Ap -A?) <Ap ( -A?) T 

j=i J J 
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TABLE 8.1 MONTE CARLO RESULTS BASED ON IDENTIFICATION 
FOR 50 SETS OF SIMULATED DATA 


Trace of Sample Covariance 
Determinant of Sample Covariance 
Eigenvalues of Sample Covariance 


Parameter Standard Deviations 


Average Trace of Information 
Matrix 

Eigenvalues of Average 
Information Matrix 


Average Determinant of 
Information Matrix 

Average Trace of the 
Covariance Matrix 
(Cramer-Rao Lower Bound) 

Lower bound on parameter 
standard deviations (from 
Cramer-Rao Lower Bound) 


Optimal Input 

.242 


1.62 x 10 


-19 


Subop timal Input 

.315 


1.501 x 10 


-15 


.234 

.262 

* 

.725 x 10~ 2 

.514 x 10" 

.252 x 10~ 3 

.115 x 10 -2 
« 

.188 x l<f A 

.766 x 10 

.202 x IQ** 7 

.126 x 10~ 5 

.40? 

.307 

.295 

.491 

.00925 

.0235 

.0771 

.0537 

.000570 

.00168 

2.15 x 10 7 

4.79 x 10 5 

2.14 x 10 7 

4.79 x 10 5 

o 

2.95 x 10 4 

8.46 x 10 J 

A 

6.56 x 10 3 

4.77 x 10 

1 

1.39 x 10 2 

2.18 x 10 

1.12 

4.14 

k 

1 ft 

4.70 x 10 iO 

14 

1.95 x 10 


.182 


.312 


.351 

.234 

.00876 

.0665 

.000247 


.303 

.441 

.0311 

.0568 

.00262 
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where A Pj is the error in the parameter estimates for the j th run and Ap is its 
mean. As stated, it was found that the sample covariance matrix was fairly 
close to the Cramer-Rao lower bound. 

By almost all measures of performance the optimal input performed 
better than the suboptimal input. The determinant of the sample covariance, 
giving a measure of the overall parameter uncertainty based on the actual 
derived parameter estimates, was four orders of magnitude smaller for the 
optimal input. The eigenvalues of the sample covariance were smaller, 
on a one-to-one basis, for the optimal input, indicating a smaller dimension 
for each axis of the uncertainty ellipsoid. 

The histograms of parameter estimates are shown in Fig s . 8.12 - 8.16 
For the M a , Z q and parameters, the optimal input definitely pro- 
duced a better ensembfe of parameter estimates. The mean estimate 
value was much closer to the actual parameter value and the standard 
deviations and mean square errors were smaller. The performance for 
the two inputs was about the same for the M while the suboptimal input 
did outperform the optimal input on the fourth parameter. (M 6 ) However, 

it should be kept in mind that the accuracy of the parameter estimates 
themselves was not a direct performance objective. Rather, the overall 

input performance, as measured by the determinant or trace of the covariance 
matrix was the criterion of interest. 
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It is also interesting to note that the standard deviations for the 
parameter estimates obtained from the inverse of the information matrix 
(Cramer-Rao Lower Bound) did in fact bound the parameters standard deviations 
obtained from compiling the individual parameter estimates. For the optimal 
input, a comparison of these standard deviations Is given below: 


Lower Bound From 
Information Matrix Inverse 
(Cramer-Rao Lower Bound) 


Actual Standard Deviation 
of Parameter Estimates 


.351 

.234 

.00876 

.0665 

.000247 


.407 

.295 

.00925 

.0771 

.000570 


The histograms of the error in estimating two of the four components 
of the observation noise covariance are shown in Fig. 8.1 7 and 8.1 
errors in estimating the covariance are plotted rather than the covariance 
estimates themselves because the actual (sample) covariance of the noise 
varied from run to run due to the finite data length. As these histograms indi- 
cate, the error in estimating the 2nd component was consistently less than 5% 
for both the optimal and sub-optimal inputs and most often within 1% for the 
optimal input. For the third component, the error with the optimal input is 
consistently less than 7% and with the sub-optimal input, 9%. Overall, it is 
accurate to state that the performance of the optimal input in identifying the 
measurement noise covariances was only slightly improved over that of the sub- 


optimal input. 
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Fig. 8.14 PARAMETER ESTIMATE HISTOGRAMS FOR Z 
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Fig. 8.16 PARAMETER ESTIMATE HISTOGRAMS FOR Z. 
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8 .3 Optimal Input Through First-Orde r Filter 

One possible objection to the optimal inputs which are derived from 
the optimization technique is that they start from nonzero value which 
cannot be realized in practice. For example, in the case of the C-8 
Monte Carlo simulation, the initial elevator deflection for the optimal 
input was 10°. Although the suboptimal doublet input had an initial 
discontinuity of the same magnitude, a more realistic model of the 
situation would include the lag action of the control servos. This 
would prevent an instantaneous change in elevator deflection for both the 
optimal and suboptimal inputs. In addition, the optimal input, for 
identifying the short period parameters, has the disadvantage that the 
longitudinal velocity does not return to the nominal level after the 
input has been applied. For a nominal steady state value, the input 
must be two sided, i.e. both positive and negative values. 

This section compares the performance of a two sided optimal input, 
with a total energy of 101.25 and length of 4 seconds to the performance 
of a suboptimal doublet input, of equal energy. 

The 2 second optimal input which was used to generate the 4 - second 
input was derived for the C-8 four state longitudinal equations of motion 
given in section 8.2.1. A first order dynamical system with a time constant 
of .2 seconds and unity gain was used to simulate the control surface 
servo mechanism. 

The 4 second optimal input after passed through the model of the servo 
dynamics, appears as in Fig. 8.1 9 . Note that the actual control surface 

deflection has no discontinuities, starts at 0° and has no violent maneuvers. 
The suboptimal input, on the other hand, which is shown in Fig. 8*20 , 

while also beginning at 0°, entails some much more drastic deflections. 

Although a Monte Carlo indentif ication simulation was not performed 
for these two inputs, their performance, based on the charateristics of 
the information matrix, can still be compared. The determinant of the 
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TWO-SIDED SUB-OPTIMAL INPUT FROM OUTPUT OF FIRST ORDER SERVO 



information matrix, representing the reciprocal of the volume of the 
uncertainty ellipsoid, was 1.6 x 10 ® for the optimal input and 7.3 x 
10 for the suboptimal doublet input. Three out of five eigenvalues 
of the information for the optimal input were larger, on a one-to-one 
basis, than for the suboptimal input, and the trace of M was 1.4 x 10® 
versus 9.7 x 10®, with the one for the optimal input again being 
larger. 

In summary, the performance of the 4 second optimal input versus the 
suboptimal input including the effects of the making the input two-sided 
and the control surface servo delay, changed only slightly from that 
observed during the extensive Monte Carlo simulation. Further work 
along these lines is needed to make optimal inputs realizable in practice. 
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IX 


CONCLUSIONS 


In this report, the problem of identifying aircraft stability and 
control derivatives from flight test data was discussed. It was shown 
that the three important elements of the identification process are 

(i) the identification method, (ii) the design of control inputs, and (iii) the 
instrumentation. The first part of the report described various methods 
that have been used in the past for identification. The special character- 
istics and limitations of these methods were discussed. A new technique, 
based on the Maximum Likelihood Criterion, was then derived and discussed 
in detail. The technique is applicable to nonlinear models containing 
both process and measurement noise. As such, the Maximum Likelihood 
Method presented here is the most general technique that has been developed 
for identifying stability and control derivatives from flight test data. 

It consists of a combination of a Kalman filter (linear) or an Extended 
Kalman filter (nonlinear) for estimating the state and a Modified Newton- 
Raphson iterative procedure for estimating the parameters. 

The Maximum Likelihood Identification Method was applied successfully 
to three different problems : 

(i) Identification of parameters from simulated data for a nonlinear 
model of X-22 VTOL aircraft containing both process and measure- 
ment noise. 

(ii) HL-10 Lifting Body flight test data without gusts. 

(iii) M2/F3 Lifting Body flight test data containing gusts. 

The Output Error and Equation Error methods which were also tried 
gave either poor results or failed to converge on problems (i) and (iii) . 

Additional problems of identifiability were encountered in the processing 
of the HL-10 and M2/F3 flight test data. Those problems were manifested 
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as a) opposite signs on the parameter estimates compared to wind-tunnel 
and theoretical values, b) large parameter covariances, and c) high 
correlation between the parameter estimates. Three different methods were 
tried for alleviating these problems! 


(i) A priori weighting 
(ii) Fixing certain parameters 
and (iii) Rank-deficient solutions. 

The last method seems to hold the greatest promise for further 
developments. 

The second part of the report discusses the theory and the computation 
of optimal inputs for linear system identification. The criterion used 
for input design is the maximization of the output sensitivity to para- 
meter variations. This criterion is related to the Fisher Information 
matrix and the Cramer-Rao lower bound for the covariance of the parameter 
estimates. The objective of this effort was to determine the control surface 
deflections that enhance the effectiveness of identifying stability and 
control derivatives from flight test data. The specific application 
considered was the longitudinal dynamics of a C-8 aircraft. The optimal 
elevator surface deflection time histories were computed and compared with 
a doublet input of the same energy and time duration. A Monte Carlo simu- 
lation was done using the optimal and the doublet inputs and identifying 
short-period parameters from noisy output data. The optimal inputs were 

shown to provide more accurate estimates of aircraft parameters compared 
to a doublet input. 
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X 


areas for further investigation 


The following areas are proposed for further research on and 
practical application of the maximum likelihood Identification method 

and the design of optimal inputs: 

(i) Complete preflight simulation of the Integrated Aircraft 
Identification process for a future flight test program. 

(ii) Identification of stability and control derivatives from 
flight test data for specific important (e.g., high angle-of- 
attack , transonic) flight regimes. This would require the development 
of methods for model structure determination since the aero- 
dynamic models in such flight regimes are not well known. 

(iii) Correlation of the flight test results with the wind-tunnel 
results for high performance aircraft. 

(ly) Determination of the effect of SAS on the identif lability 
of the parameters. 

i i- e _r Rank— Deficient Method for solving 

(v) Further developments of the Kanic uem-ie. 

identif iability problems. 

(Vl) Extensions of the Input Design procedure to nonlinear models 
and process noise. 

(Vii) Modifications of the optimal inputs for pilot acceptability 
and ease of implementation. 

(viii) Flight Test validation of optimal inputs. 
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APPENDIX A 


EQUATION OF MOTION FOR X-22 VTOL 


The model for the equations of motion for the X-22 was designed 
to represent the longitudinal-vertical three degree of freedom motions 
of the vehicle in the body axis coordinate system. Defining 

u = change in velocity along the x axis from trim condition 

w = change in velocity along the z axis from turn condition 

0 = change in pitch angle (fuselage attitude relative to 

the horizontal) from the trim value 
q = change in pitch rate from zero trim value; since vehicle 

is restricted to longitudinal-vertical, wings level motion 
only, q = 0 

^es ~ c ^ an 8 e i n elevator stick deflection (positive 6 gives 

es ° 

positive pitching moment) from trim condition 
n x = accelerometer signal along the x axis 
n z = accelerometer signal along the z axis 


the nonlinear model, derived from an examination of wind tunnel 
data, is given by 
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X u Cu) X w (u) 0 



Z u (u) Z w (u) 0 


< 

to 

0 0 0 


1 V3 J 

H u (u) M w (u) M q (u) 


L J 

__ — 



The stability and control derivatives are all expressed as first 

order functions of u (e.g. M w = M w + M w ♦ u) except for M q (u), X q (u) and 

° u 

Z (u) which are second order functions. The derivatives of X y (u), Z y (u) 
o d_ 

and M u (u) are given as ( x c ( u ) + x w ( u ) * w + X 6es^ ^es^’ du ^ Z o^ 

+ V u)W + Z «es(u) • 6 es ) and £ (M q (u) + H w (u) . w + M q (u) * q + M fies (u) • 6 eg ) 

respectively andv v v r and v 3 are independent, zero mean white noise 
sequences with (diagonal matrix) covariance Q. There are therefore 
a total of 23 parameters to identify, enumerated as 



There are seven measurements being made of the aircraft state. These 
include the four state variables u,w,8 and q , the pitch acceleration, 
q, and the two accelerometer outputs n^ and n z . The equations for 
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the accelerometer outputs are 


1 * 

n x “ — (u + qw) + sin 0 

o 

n z ~ ( w ~ qu) - COS 0 

o 

• • 

Substituting in u and w, these expressions can be written as 




187 


APPENDIX B 


GRADIENT AND INFORMATION MATRIX CALCULATION WITH 
ADDED PARTIAL DERIVATIVE TERMS 


Since the presence of the process noise requires the use of a 
Kalman filter in computing the sensitivity functions, the exact 
equations for the gradient and information matrix will include the 
additional partials of the Kalman gain and state estimate error 
covariance with respect to the parameters. In those cases where 
the Kalman gain reaches a steady state, this steady state value can 
often be included as unknown parameters (with the only error being during 
the transient period). However, whenever the system equations are 
non-linear, as with the X-22 model, the Kalman filter will not reach 
a steady state and added partials will appear. 


Using the notation of Section 5.1, the gradient and information 
computations as well as the sensitivity function computations are given 
below. The additional terms included in thesejcalculations, due to the 

time varying Kalman gain, are indicated by a ^ j 

is used to denote the jth entry in the vector of unknown parameters. 


Gradient: 
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dPj N 
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Information Matrix! 
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Sensitivity Function: 

Between measurements: 
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